REMARKS 

Status of the Claims 

Claims 1-23 were rejected. Claims 12-18, 20 and 21 have been canceled without 
prejudice or disclaimer. Applicant reserves the right to pursue subject matter of claims 12-18, 20 
and 21 in a continuation or divisional application. Claims 1-11, 19, 22, and 23 are pending. 

Claims 1, 2, 3, 11, 19, 22, and 23 have been amended. To expedite prosecution, claim 1 
has been amended without prejudice or disclaimer to delete the subject matter drawn to a 
complement of the nucleic acid sequences of (a) through (d). Applicant reserves the right to 
pursue deleted subject matter of claim 1 in a continuation or divisional application. Claims 2, 3, 
11, 19, 22 and 23 have been amended to more clearly define the scope of the invention. No new 
matter has been entered by way of these amendments. 

The Objection to the Specification Should Be Withdrawn 

The specification was objected to for not complying with the sequence listing 
requirements. The sequence at the bottom of page 37 (line 27) has been amended to recite a 
sequence identifier. This sequence has been added to the sequence listing as SEQ ID NO: 12. 

No new matter has been added by way of this addition. The sequence listing and 
specification are now in compliance with 37 CFR 1.821-1.825 and the objection should be 
withdrawn. 

The Objection to the Claims Should Be Withdrawn 

The Examiner has objected to Claim 11 for improper article usage. Claim 11 has been 
amended to recite "the plant" and is, therefore, in proper format. Claim 2 was objected to for 
failing to limit the subject matter of a previous claim from which it depends. Claim 2 has been 
amended to recite "the isolated" and, as such, properly limits the scope of claim 2. Accordingly, 
it is respectfully requested that the objection to claims 2 and 11 be withdrawn. 
Amendments to the Drawings: 

The drawings were objected to for legibility and quality of letters in the figures. The 
Examiner refers specifically to the "strange quality of the letters in Figure IB" (Page 2, Item 2 of 
the February 14, 2006 Office Action). However, there is no figure labeled "IB" in this 
application. The Applicants assume the Examiner is referring to the lettering in Figure 2. 

Applicants submit herewith replacement Figures 1 and 2 (Appendix D), in which 
the darkened boxes have been removed from Figure 1 and the font of the text of Figure 2 has 
been changed to improve legibility and clarity. Accordingly, the objection to the drawings 
should be withdrawn. 
Amendments to the Specification: 

Please replace the paragraph beginning on page 3 line 29 and continuing through page 4 
line 8 with the following replacement paragraph: 

Figures 1A and B show an alignment of AXMI-004 (SEQ ID NO:3) with cry1Ac (SEQ ID 
NO:6), cry1Ca (SEQ ID NO:7), cry2Aa (SEQ ID NO:8), cry3Aa1 (SEQ ID NO:9), cry1Ia (SEQ 
ID NO:10), and cry7Aa (SEQ ID NO:11). Toxins having C-terminal non-toxic domains were 
artificially truncated as shown. The alignment shows the most highly conserved amino acid 
residues highlighted in black, and highly conserved amino acid residues highlighted in gray. 
Conserved group 1 is found from about amino acid residue 174 to about 196 of SEQ ID NO:3. 
Conserved group 2 is found from about amino acid residue 250 to about 292 of SEQ ID NO:3. 
Conserved group 3 is found from about amino acid residue 476 to about 521 of SEQ ID NO:3. 
Conserved group 4 is found from about amino acid residue 542 to about 552 of SEQ ID NO:3. 
Conserved group 5 is found from about amino acid residue 618 to about 628 of SEQ ID NO:3. 

Please replace page 37, lines 16 through 27 with the following replacement paragraph: 
Example 9. N-terminal Amino Acid Sequence of AXMI-004 Expressed in Bacillus 

Analysis of AXMI-004 expressed in Bacillus suggested that the protein product detected 
in these cultures may be reduced in size relative to the full-length AXMI-004 protein. Since 
many endotoxin proteins are cleaved at the N-terminus after expression in Bacillus, we 
determined the N-terminus of the AXMI-004 protein resulting from Bacillus expression. Protein 
samples from AXMI-004 were separated on PAGE gels, and the protein transferred to PVDF 
membrane by methods known in the art. The protein band corresponding to AXMI-004 was 
excised. The N-terminal amino acid sequence of this protein was determined by serial Edman 
degradation as known in the art. The sequence obtained was as follows: 

ERFDKNDALE (SEQ ID NO:12) 
working examples of the invention in the application, the nature of the invention, the state of the 
prior art, the relative skill of those in the art, the predictability in the art, and the breadth of the 
claimed invention. Id. Accordingly, the holding of Wands does not require that Applicants 
provide as working examples every pesticidal polypeptide that could be used to practice the 
present invention. Rather, Wands sets out factors to be considered in determining whether undue 
experimentation is required to make and use the invention. 

The Examiner argues that the specification does not enable one of skill in the art to make 
and use nucleic acids that encode polypeptides that retain pesticidal activity and have at least 
95% sequence identity to SEQ ID NO:1, 2, or 4, or 95% sequence identity to a nucleotide 
sequence that encodes SEQ ID NO:3 or 5. The Examiner incorrectly bases this conclusion solely 
on the number of possible nucleic acids having the recited percent identity to SEQ ID NO:1, 2 or 
4, or a nucleotide sequence encoding SEQ ID NO:3 or 5 while ignoring the other factors set forth 
in Wands for assessing whether undue experimentation is required. In particular, the Examiner 
has improperly discounted the guidance provided in the specification and the working examples 
set forth in the application (page 4 of the Office Action mailed February 14, 2006). 

First, sufficient guidance for making and using the recited sequences is present in the 
specification. The claimed variants and fragments of SEQ ID NO:1, 2, or 4, or nucleotide 
sequences encoding SEQ ID NO:3 or 5 are limited by a percent identity (i.e., 95% identity) and 
further limited by the functional requirement that they possess pesticidal activity. Guidance for 
preparing variants and fragments of SEQ ID NO:1, 2, or 4, or nucleotide sequences encoding 
SEQ ID NO:3 or 5 and for determining percent identity is provided in the specification and 
generally known in the art. See page 8, lines 22-27, and pages 9-13. Numerous delta-endotoxins 
were also well known in the art at the time the application was filed. See Crickmore et al. (1998) 
Microbiol. Molec. Biol. Rev. 62:807-813, which is incorporated by reference on page 2, lines 8-9 
and is submitted herewith as Appendix A, and Crickmore et al. (2004) Bacillus thuringiensis 
Toxin Nomenclature at lifesci.sussex.ac.uk/Home/Neil_Crickmore/Bt. The necessary molecular 
biology and mutagenesis techniques for preparing the variants and fragments of pesticidal 
sequences of the invention are routine. Moreover, methods for assessing the pesticidal activity 
of a polypeptide are readily available in the art and provided in the specification. See, for 
example, page 11, lines 22-26 and Examples 7, 10, 11 and 12. 

In order to identify the pesticidal sequences encompassed by the present claims, one of 
skill in the art would only need to prepare variants and fragments of the nucleotide sequence of 
SEQ ID NO:1, 2, or 4, or a nucleotide sequence encoding SEQ ID NO:3 or 5, having the 
specified characteristics recited in the claims (e.g., at least 95% identity) and then assay these 
polypeptides for pesticidal activity. Methods for preparing variants and fragments and 
testing the resulting polypeptides for pesticidal activity are routine in the art and described in the 
specification. Although some experimentation is required to practice the claimed invention, it is 
now customary in the art to generate a large number of sequences and to test them in a 
large-scale assay for a desired function, and, therefore, such experimentation is not undue, particularly 
in view of the routine nature of the required methods. Contrary to the Examiner's conclusions, 
in order to identify variants and fragments of the nucleotide sequence of SEQ ID NO:1, 2, or 4, 
or a nucleotide sequence encoding SEQ ID NO:3 or 5 that could be used in the invention, a 
person skilled in the art would only need to utilize standard molecular biology and mutagenesis 
techniques and routine screening tests for pesticidal activity. Therefore, given the level of skill 
and knowledge in the art, the availability of standard methods and assays, and the significant 
guidance provided in the specification, Applicants respectfully submit that the amount of 
experimentation required to identify delta-endotoxins and variants and fragments thereof having 
pesticidal activity and the structural features recited in the claims is routine, not undue. 

The Examiner further argues that mutation of sequences, even conservative substitutions, 
does not produce predictable results and, therefore, the specification is not enabling with respect 
to variants of the nucleotide sequence of SEQ ID NO:1, 2, or 4, or a nucleotide sequence 
encoding SEQ ID NO:3 or 5. The Office Action cites Lazar et al. (1988) Molecular and 
Cellular Biology 8:1247-1252 and Hill et al. (1998) Biochem. Biophys. Res. Comm. 244:573-577 
in support of the general unpredictability of the art with respect to modification of nucleotide 
sequences. Each reference, however, simply teaches that alteration of highly conserved 
sequences will disrupt function. Lazar et al. teach that alterations in amino acid residues 47 and 
48 in TGF-alpha can alter the activity of the polypeptide. Contrary to the Examiner's 
conclusion, the alteration in the polypeptide was specifically designed to occur at amino acid 
positions that are highly conserved in the EGF-like family of polypeptides. Similarly, the 
modified residues described by Hill et al. were conserved among bacterial and plant 
ADP-glucose pyrophosphorylases. As set forth in the first line of the abstract, "[t]wo absolutely 
conserved histidines and a third highly conserved histidine are noted in eleven bacterial and plant 
ADP-glucose pyrophosphorylases" (emphasis added). These absolutely and highly conserved 
histidines were mutagenized and characterized in the paper. One of skill in the art would not be 
surprised that modification of one of these highly conserved amino acids would lead to the loss 
of function described by the authors. Applicants further note that the Lazar et al. and the Hill et 
al. references are directed to TGF-alpha and ADP-glucose pyrophosphorylase, neither of which 
has any relation to the pesticidal sequences of the present invention. Thus, the cited references 
do not support the Examiner's broad assertion of inherent unpredictability of protein function 
resulting from the mutation of the underlying nucleotide sequence. In fact, both references 
support Applicants' arguments that at the time the application was filed one of skill in the art 
could modify polypeptide sequences and test the resulting variants for biological activity. 

Furthermore, the specification provides guidance regarding conservative modifications 
that are unlikely to disrupt biological activity. See, for example, pages 11-12. Thus, by 
reference to a standard codon table, one of skill in the art could predict which modifications 
would not affect the biological activity of the encoded polypeptide. Also, the specification lists 
numerous examples of conserved residues that are not likely to tolerate substitution (see page 
13), delineates conserved domains characteristic of delta-endotoxin proteins (see page 4), and 
highlights conserved residues in the sequences of the invention (see Figure 1 as originally filed). 
The replacement figure does not highlight conserved residues; however, one of skill in the art 
would understand how to use the alignment provided in the replacement figure to identify 
conserved residues using, for example, the methods described in the instant specification (see 
pages 10-11). 

Appl. No.: 10/782,020 
Amdt. dated May 12, 2006 
Reply to Office Action of February 14, 2006 

Moreover, as described above, Applicants have disclosed pesticidal sequences, and 
variants and fragments thereof, and the art was replete with additional delta-endotoxin sequences 
at the time the application was filed. Information relating to conserved regions of 
delta-endotoxins may be obtained from these sequences. A person of skill in the art would appreciate 
that comparison and alignment of known delta-endotoxin sequences may reveal information 
regarding appropriate sites or regions for modifications. By aligning these sequences, one may 
be able to identify conserved residues or regions within these proteins that are unlikely to tolerate 
mutation and still retain pesticidal activity. Methods for aligning sequences, such as by using the 
CLUSTAL algorithm, are described in the specification. See pages 9-11. 

In addition, detailed information about the structure of delta-endotoxins was known in the 
art. See, for example, Li et al. (1991) Nature 353:815-821 (describing the crystal structure of the 
Cry3A protein), which is incorporated by reference on page 12 of the specification, and Morse et 
al. (2001) Structure 9:409-417, both of which are submitted herewith (Appendices B and C, 
respectively). Delta-endotoxins are extremely well-characterized and related to each other to 
various degrees by similarities in their amino acid sequences and tertiary structures. A combined 
consideration of the published structural analyses of delta-endotoxins and the reported functions 
associated with particular structures, motifs, and the like indicates that specific regions of the 
toxin are correlated with particular functions and discrete steps of the mode of action of the 
protein. Thus, a rational scheme for determining the regions of a delta-endotoxin that would 
tolerate modification is provided. Based on the regions of delta-endotoxins that are conserved 
among protein family members, the skilled artisan could choose among possible modifications to 
produce polypeptides within the structural parameters set forth in the claims and then test these 
modified variants to determine if they retain pesticidal activity. In light of the guidance provided 
in the specification and the state of the art with respect to delta-endotoxins, a skilled artisan 
could readily conclude which amino acids are essential for structure and function and could 
envisage similar sequences that are 95% identical to the nucleotide sequence of SEQ ID NO:1, 2, 
or 4, or a nucleotide sequence encoding SEQ ID NO:3 or 5, and that retain pesticidal activity. 
As such, one of skill in the art could identify the pesticidal sequences encompassed by the 
present claims without undue experimentation. 

The Examiner has also cited Guo et al. (2004) Proc. Natl. Acad. Sci. USA 101:9205-9210 
for the proposition that increasing the number of amino acid substitutions in a protein increases 
the probability that the protein will be functionally inactivated. The Examiner uses this reference 
as evidence that making and analyzing delta-endotoxins that have multiple amino acid 
substitutions but that still retain pesticidal activity will require undue experimentation. The 
Examiner, however, has mischaracterized the Guo et al. reference. The cited reference is 
directed to analysis of the probability that a random amino acid replacement will lead to a 
protein's functional inactivation (emphasis added). In contrast, the specification provides a 
rational and systematic method for designing delta-endotoxin variants that retain pesticidal 
activity. One of skill in the art would appreciate that regions known to be important for 
pesticidal activity would be unlikely to tolerate significant mutation and, therefore, would not 
expect such mutations to result in a biologically active protein. Thus, the teachings of Guo et al. 
do not support the Examiner's conclusion that the present claims lack enablement. 

The Examiner further relies on the teachings of de Maagd et al. (1999) Appl. Environ. 
Microbiol. 65:4369-4374, Tounsi et al. (2003) J. Appl. Microbiol. 95:23-28 and 
Angsuthanasombat et al. (2001) J. Biochem. Mol. Biol. 34:402-407 in support of the assertion 
that amino acid substitutions in delta-endotoxin proteins are unpredictable. However, each of 
these references describes substitutions (which are largely non-conservative) in conserved 
regions. De Maagd et al. teach that the replacement of several groups of amino acids within 
Domain III of Cry1E with the corresponding amino acids of Cry1C will alter the specificity 
and/or toxicity of Cry1E. Since the conserved Domain III is well known by those of skill in the 
art to be involved in specificity of a delta-endotoxin toward a pest, it would be no surprise that 
alteration of this domain could affect specificity of the protein. In fact, that was the intention of 
de Maagd et al. Similarly, Tounsi et al. discuss the single amino acid difference between 
Cry1Ia1 and Cry1Ia2 (which is a non-conservative substitution of aspartic acid for tyrosine at 
position 233) as being critical to insecticidal specificity of these two toxins. Again, this 
substitution occurs in the conserved Domain I. Finally, Angsuthanasombat et al. teach a critical 
amino acid residue at position 136 where even a conservative substitution could lead to loss of 
pesticidal activity. Yet again, the authors specifically targeted amino acids in conserved Domain 
I in order to alter function. Since the instant specification clearly defines the conserved domains 
described by the aforementioned references with respect to the claimed sequences (see page 4), 
one of skill in the art would appreciate that substitutions made in these domains could lead to a 
loss of specificity and/or toxicity. Further, the references cited by the Examiner actually support 
the Applicant's assertion that one of skill in the art at the time of the invention would understand 
which residues could be altered to change the function of delta-endotoxins, implying that one of 
skill would equally understand which residues not to change when maintenance of function is 
desired. 

In establishing non-enablement, the burden rests initially with the Examiner to 
substantiate the unpredictability of the art and that, given the unpredictability, the specification 
does not provide sufficient information to guide those of skill to make and use the claimed 
invention across the full scope of the claims. In view of the discussion above, the references 
cited by the Examiner fail to support the position that claims 1-11, 19, 22, and 23 are not 
enabled. 

The Examiner also states that the specification fails to teach how to use a complement of 
nucleic acids encoding pesticidal protein with 95% identity to SEQ ID NO:3 or 5 or nucleic acids 
with 95% identity to SEQ ID NO:1, 2, or 4. The Applicant respectfully disagrees. Page 8, lines 
4-7 state that the complement of a claimed nucleotide sequence is one that would hybridize to a 
given nucleotide sequence to thereby form a stable duplex. Page 13, lines 19-26 state that 
hybridization methods (using, for example, complementary sequences) can be used to screen 
cDNA or genomic libraries for delta-endotoxin sequences having substantial identity to the 
sequences of the invention. Therefore, the specification clearly teaches how to use the 
complement of a nucleic acid sequence of the invention to, for example, screen for similar delta- 
endotoxin sequences. However, to expedite prosecution, claim 1 has been amended to delete the 
subject matter pertaining to complementary sequences. 

The Examiner further maintains that the specification does not enable the transformation 
of any plant with a nucleotide sequence with 95% identity to the nucleotide sequence of SEQ ID 
NO:1, 2, or 4, or a nucleotide sequence encoding SEQ ID NO:3 or 5 because undue trial and 
error experimentation would be required to screen for nucleotide sequences encompassed by the 
claims and plants transformed therewith to identify those plants with pesticidal activity. As 
discussed above, the amount of experimentation required to identify a nucleotide sequence that 
has 95% sequence identity to SEQ ID NO:1, 2, or 4, or to a nucleotide sequence encoding SEQ 
ID NO:3 or 5 is not undue. With respect to transformation of plants with these sequences, the 
specification provides routine methods for transformation of plants with nucleotide sequences 
and the regeneration of transgenic plants. See pages 20-26 and Examples 14 and 15. Given the 
guidance provided in the specification and the knowledge in the art, the claims directed to 
transformation of a plant with a delta-endotoxin sequence, or variant or fragment thereof, are 
fully enabled. 

In light of the above arguments, the level of skill and knowledge in the art, and the 
guidance provided in the specification, Applicants respectfully submit that the specification is 
enabling for the full scope of claims 1-11, 19, 22 and 23. Thus, the rejection of the claims under 
35 U.S.C. § 112, first paragraph, for lack of enablement should be withdrawn. 

Written Description 

Claims 1-11, 19, 22 and 23 were further rejected under 35 U.S.C. § 112, first paragraph, 
as failing to satisfy the written description requirement. The rejection is respectfully traversed. 

The Examiner asserts that the disclosure is insufficient to support claims that are drawn to 
a genus of nucleic acids having 95% sequence identity to SEQ ID NO:1, 2, or 4, or nucleic acids 
encoding polypeptides having 95% identity to SEQ ID NO:3 or 5. In order to satisfy the written 
description requirement of 35 U.S.C. § 112, the application must reasonably convey to one 
skilled in the art that the applicant was in possession of the claimed subject matter at the time the 
application was filed. Vas-Cath Inc. v. Mahurkar, 935 F.2d 1555, 1563, 19 U.S.P.Q.2d (BNA) 
1111, 1117 (Fed. Cir. 1991). Every species encompassed by the claimed invention, however, 
need not be disclosed in the specification to satisfy the written description requirement of 35 
U.S.C. § 112, first paragraph. Utter v. Hiraga, 845 F.2d 993, 6 USPQ2d 1709 (Fed. Cir. 1988). 
The Federal Circuit has made it clear that sufficient written description requires simply the 
knowledge and level of skill in the art to permit one of skill to immediately envision the product 
claimed from the disclosure. Purdue Pharma L.P. v. Faulding Inc., 230 F.3d 1320, 1323, 56 
USPQ2d 1481, 1483 (Fed. Cir. 2000) ("One skilled in the art must immediately discern the 
limitations at issue in the claims."). 
Moreover, the "Guidelines for Examination of Patent Applications Under 35 U.S.C. 
§ 112, ¶ 1, 'Written Description' Requirement" state that a genus may be described by "sufficient 
description of a representative number of species ... or by disclosure of relevant, identifying 
characteristics, i.e. structure or other physical and/or chemical properties." Id. at 1106. This is 
in accordance with the standard for written description set forth in Regents of the University of 
California v. Eli Lilly & Co., 119 F.3d 1559 (Fed. Cir. 1997), where the court held that "[a] 
written description of an invention involving a chemical genus, like a description of a chemical 
species, 'requires a precise definition, such as by structure, formula, or chemical name' of the 
claimed subject matter sufficient to distinguish it from other materials." 119 F.3d at 1568, citing 
Fiers v. Revel, 984 F.2d 1164 (Fed. Cir. 1993). In Enzo Biochem, Inc. v. Gen-Probe, Inc., 323 
F.3d 956 (Fed. Cir. 2002), the Federal Circuit adopted the PTO standard for written description, 
stating: 

[U]nder the Guidelines, the written description requirement would be met ... if 
the functional characteristics of [a genus of polypeptides] were coupled with a 
disclosed correlation between that function and a structure that is sufficiently 
known or disclosed. We are persuaded by the Guidelines on this point and adopt 
the PTO's applicable standard for determining compliance with the written 
description requirement. 

The claims of the present application meet the requirements for written description set 
forth by the Federal Circuit. The claims recite that the nucleic acid has 95% sequence identity 
to the nucleotide sequence of SEQ ID NO:1, 2, or 4, or to a nucleotide sequence encoding SEQ 
ID NO:3 or 5. Methods for determining percent identity between any two sequences are known 
in the art and are provided in the specification. See pages 8-13. As discussed above, nucleotide 
sequences for full-length AXMI-004 (SEQ ID NO:l), as well as variants and fragments (e.g., 
SEQ ID NO:2 and 4) are disclosed in the specification. Numerous delta-endotoxin sequences 
were also generally known in the art at the time the application was filed. Moreover, detailed 
information regarding the structure of delta-endotoxins and the reported functions associated 
with particular structures, regions, and motifs was also available in the prior art as well as 
discussed in detail on page 2, lines 22-29, Figure legend 1, and on pages 12-13. At the time of 
filing, it was known that delta-endotoxins generally comprise three domains: a seven-helix 
bundle that is involved in pore formation, a three-sheet domain that has been implicated in pore 
formation, and a beta-sandwich motif. See Li et al. (1991) Nature 353:815-821. Thus, the 
recitation of polypeptides having a particular percent identity to a delta-endotoxin provides very 
specific and defined structural parameters of the sequences that can be used in the invention. 
These structural limitations are sufficient to distinguish the nucleotide and amino acid sequences 
of the invention from other nucleic acids and polypeptides and thus sufficiently define the genus 
of sequences useful in the practice of the present invention. 

The Examiner is reminded that the description of a representative number of species does 
not require the description to be of such specificity that it would provide individual support for 
each species that the genus embraces. 66 Fed. Reg. 1099, 1106 (2000). Satisfactory disclosure 
of a "representative number" depends on whether one of skill in the art would recognize that the 
applicant was in possession of the necessary common attributes or features of the elements 
possessed by the members of the genus in view of the species disclosed. 66 Fed. Reg. 1099, 
1106 (2000). Here, Applicants have provided nucleotide and amino acid sequences for 
exemplary pesticidal sequences and variants and fragments thereof encompassed by the claims. 
Moreover, numerous delta-endotoxin sequences were known and readily available in the art. 
Therefore, Applicants submit that in view of the present disclosure and the knowledge and level 
of skill in the art the skilled artisan would envision the claimed invention. 

The description of a claimed genus can be by structure, formula, chemical name, or 
physical properties. See Ex parte Maizel, 27 USPQ2d 1662, 1669 (B.P.A.I. 1992), citing Amgen 
v. Chugai, 927 F.2d 1200, 1206 (Fed. Cir. 1991). A genus of polypeptides may therefore be 
described by means of a recitation of a representative number of amino acid sequences that fall 
within the scope of the genus, or by means of a recitation of structural features common to the 
genus, which features constitute a substantial portion of the genus. See Regents of the University 
of California v. Eli Lilly & Co., 119 F.3d 1559, 1569 (Fed. Cir. 1997); see also Guidelines for 
Examination of Patent Applications Under the 35 U.S.C. 112, first paragraph, "Written 
Description" Requirement, 66 Fed. Reg. 1099, 1106 (2000). The recitation of a predictable 
structure (i.e., an amino acid sequence having a specified percent identity or number of 
contiguous amino acid residues of a particular sequence) is sufficient to satisfy the written 
description requirement. Thus, the application provides the structural features that characterize 
sequences having at least 95% sequence identity to SEQ ID NO:1, 2, or 4, or to a nucleotide 
sequence encoding SEQ ID NO:3 or 5 that retain pesticidal activity. 

An Applicant may also rely upon functional characteristics in the description, provided 
there is a correlation between the function and structure of the sequences recited in the claims. 
Id., citing Lilly at 1568. The present claims further recite functional characteristics that 
distinguish the sequences of the claimed genus. Specifically, the claims recite that the sequences 
having at least 95% sequence identity to SEQ ID NO:1, 2, or 4, or to a nucleotide sequence 
encoding SEQ ID NO:3 or 5 encode proteins which have pesticidal activity. The specification 
and the art provide standard assays that may be used to measure pesticidal activity. See, for 
example, page 8, lines 27-31. Furthermore, as noted above, Applicants have disclosed fragment 
sequences that retain pesticidal activity (e.g., SEQ ID NO:4, which encodes a fragment of SEQ 
ID NO:3). Accordingly, both the structural and functional properties that characterize the genus 
of sequences that can be used to practice the invention are specifically recited in the claims. The 
sequences that fall within the scope of the claims can readily be identified by the methods set 
forth in the specification. 

In summary, the specification provides an adequate written description of the claimed 
invention. In particular, the specification provides: nucleotide and amino acid sequences for 
pesticidal toxins, and variants and fragments thereof, that fall within the scope of the claims; 
guidance regarding sequence alterations that do not disrupt pesticidal activity of a toxin; 
guidance for determining percent identity; and methods for assaying the pesticidal activity of 
proteins. In view of the above remarks and claim amendments, Applicants submit that the 
relevant identifying structural and functional properties of the genus of sequences of the present 
invention would be clearly recognized by one of skill in the art. Consequently, Applicants were 
in possession of the invention at the time the application was filed, and the rejection of the claims 
under 35 U.S.C. § 112, first paragraph, for lack of written description should be withdrawn. 

The Rejection of the Claims Under 35 U.S.C. § 112, Second Paragraph, Should Be Withdrawn 

Claims 3, 11, and 19, as well as dependent claims therefrom, were rejected under 35 
U.S.C. § 112, second paragraph as being indefinite for failing to particularly point out and 
distinctly claim the subject matter that Applicant regards as the invention. 

Claim 3 has been amended to recite "relative to the GC content of SEQ ID NO:1, 2, or 
4." Support for this amendment can be found on page 24, line 30 through page 25, line 7. Claim 
11 has been amended to recite "the plant" of claim 9. Claim 9 depends from claim 8, 
which depends from claim 6, which depends from claim 4, which depends from claim 1. 
Therefore, as amended, claim 11 now describes a transgenic seed derived from a plant that 
comprises a host cell that contains a vector comprising the nucleic acid of claim 1. Claim 19 has 
been amended to recite "the nucleic acid molecule" such that claim 19 now encompasses a 
method for producing a polypeptide by culturing a host cell that contains a vector that comprises 
the nucleic acid of claim 1. Accordingly, the rejection of claims 3, 11, and 19 under 35 U.S.C. § 
112, second paragraph should be withdrawn. 

The Rejection of the Claims Under 35 U.S.C. § 102 Should Be Withdrawn 

Claims 22-23 were rejected under 35 U.S.C. § 102(b) as being anticipated by Barton et 
al. (U.S. Patent No. 6,833,449). Barton et al. teach tobacco plants transformed with a nucleic 
acid encoding a Cry1 protein. The Examiner states that the recitation of "a" before "nucleotide 
sequence of SEQ ID NO:1, 2 or 4" in parts (a) and (d) and "an" before "amino acid sequence of 
SEQ ID NO:3 or 5" in part (c) encompasses nucleic acids that comprise the full-length sequence 
of SEQ ID NO:1, 2 or 4, or any portion of SEQ ID NO:1, 2 or 4 or that encode the full-length of 
SEQ ID NO:3 or 5 or any portion of SEQ ID NO:3 or 5. Claims 22 and 23 have been amended 
to recite "the" before "nucleotide sequence" and "amino acid sequence." As such, the Cry1 
protein taught in Barton et al. does not comprise the sequence of SEQ ID NO:1, 2, or 4, nor a 
sequence with 95% identity to SEQ ID NO:1, 2 or 4. Accordingly, the rejection of claims 22 and 
23 under 35 U.S.C. § 102(b) should be withdrawn. 

It is not believed that extensions of time or fees for net addition of claims are required, 
beyond those that may otherwise be provided for in documents accompanying this paper. 
However, in the event that additional extensions of time are necessary to allow consideration of 
this paper, such extensions are hereby petitioned under 37 CFR § 1.136(a), and any fee required 
therefore (including fees for net addition of claims) is hereby authorized to be charged to Deposit 
Account No. 16-0605. 



"Express Mail" mailing label number EV395779879US 
Date of Deposit May 12, 2006 
Mail Stop Amendment, Commissioner for Patents, P.O. Box 1450, Alexandria, VA 22313-1450 

BACKGROUND AND HISTORY OF PESTICIDAL 
CRYSTAL PROTEIN NOMENCLATURE 

Since the first cloning of an insecticidal crystal protein gene 
from Bacillus thuringiensis (91), many other such genes have 
been isolated. Initially, each newly characterized gene or protein 
received an arbitrary designation from its discoverers: icp 
(64); cry (21, 121); kurhd1 (31); Bta (88); bt1, bt2, etc. (40); 
type B and type C (43); and 4.5 kb, 5.3 kb, and 6.6 kb (55). The 
first systematic attempt to organize the genetic nomenclature 
relied on the insecticidal activities of crystal proteins for the 
primary ranking of their corresponding genes (44). The cryI 
genes encoded proteins toxic to lepidopterans; cryII genes 
encoded proteins toxic to both lepidopterans and dipterans; cryIII 
genes encoded proteins toxic to coleopterans; and cryIV genes 
encoded proteins toxic to dipterans alone. 

This system provided a useful framework for classifying the 
ever-expanding set of known genes. Inconsistencies existed in 
the original scheme, however, due to attempts to accommodate 
genes that were highly homologous to known genes but 
did not encode a toxin with a similar insecticidal spectrum. The 
cryIIB gene, for example, received a place in the 
lepidopteran-dipteran class with cryIIA, even though toxicity 
against dipterans could not be demonstrated for the toxin designated 
CryIIB. Other anomalies arose after the nomenclature was 
established. The protein named CryIC, for example, was reported 
to be toxic to both dipterans and lepidopterans (103), 
while the protein designated CryIB was reported to be toxic to 
both lepidopterans and coleopterans (8). Because the nomenclature 
system provided no central committee or database to 
maintain standardization, new genes encoding a diverse set of 
proteins without a common insecticidal activity each received 
the name cryV, based on the next available Roman numeral 
(32, 46, 67, 100, 102, 108). 

PROPOSED NOMENCLATURE 

We propose in this review a revised nomenclature for the cry 
and cyt genes. To organize the wealth of data produced by 
genomic sequencing efforts, a new nomenclatural paradigm is 
emerging, exemplified by the internationally recognized cytochrome 
P-450 superfamily nomenclature system (68a, 122a). 
Our proposal conforms closely to this model both in concep- 
tual basis and in nomenclature format. The underlying basis of 
this type of system is to assign names to members of gene 
superfamilies according to their degree of evolutionary diver- 
gence as estimated by phylogenetic tree algorithms. The no- 
menclature format in such a system is designed to convey rich 
informational content about these relationships by appending 
to the mnemonic root a series of numerals and letters assigned 
in a hierarchical fashion to indicate degrees of phylogenetic 
divergence. This change from a function-based to a sequence- 
based nomenclature allows closely related toxins to be ranked 
together and removes the necessity for researchers to bioassay 
each new protein against a growing series of organisms before 
assigning it a name. 

In our proposed revision, Roman numerals have been exchanged 
for Arabic numerals in the primary rank (e.g., Cry1Aa) 
to better accommodate the large number of expected 
new proteins. The mnemonic Cyt to designate crystal proteins 
showing a general cytolytic activity in vitro has been retained 
because of its historical precedent and entrenchment in the 
research literature. Our definition of a Cry protein is rather 
broad: a parasporal inclusion (crystal) protein from B. thurin- 
giensis that exhibits some experimentally verifiable toxic effect 
to a target organism, or any protein that has obvious sequence 
similarity to a known Cry protein. Similarly, Cyt denotes a 
parasporal inclusion (crystal) protein from B. thuringiensis that 
exhibits hemolytic activity, or any protein that has obvious 
sequence similarity to a known Cyt protein. By these criteria, 
the nontoxic 40-kDa crystal protein from B. thuringiensis subsp. 
thompsoni, for example, has been excluded from our list, but 
the lepidopteran-active 34-kDa protein (now Cry15A) encoded 
by an adjacent gene has been included (11). 

The freely available software applications CLUSTAL W 
(110) and PHYLIP (27) define the sequence relationships 
among the toxins to form the framework of the new nomen- 
clature. In the first step, CLUSTAL W aligns the deduced 
amino acid sequences of the full-length toxins and produces a 
distance matrix, quantitating the sequence similarities among 
the set of toxins. CLUSTAL W default settings are employed, 
except that the "delay divergent sequences" setting in the mul- 
tiple-alignment parameter menu is reduced from 40 to 0%. 
The NEIGHBOR application within the PHYLIP package 
then constructs a phylogenetic tree from the distance matrix by 
an unweighted pair-group method using arithmetic averages 
(UPGMA) algorithm. The TREEVIEW application (73), with 
the "phylogenetic tree" and "ladderize left" options selected, 
produces a graphic presentation of the resulting tree. 
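
As an illustration of the clustering step described above, the following is a minimal Python sketch of UPGMA applied to a small distance matrix. It is not the PHYLIP NEIGHBOR implementation; the function name upgma, the toy labels, and the toy distances are hypothetical and chosen only to show how pairwise distances are averaged as clusters merge.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster labels by UPGMA.

    names: list of labels; dist: symmetric matrix (list of lists).
    Returns (tree, merges): tree is a nested tuple, and merges lists
    (left subtree, right subtree, node height) in merge order.
    """
    # Cluster id -> (subtree, number of leaves in the cluster).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise distances keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    merges = []
    while len(clusters) > 1:
        # Join the two closest clusters.
        (a, b), d_ab = min(d.items(), key=lambda item: item[1])
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        # Distance from the new cluster to every other cluster is the
        # size-weighted arithmetic average of the old distances.
        updated = {}
        for c in clusters:
            d_ac = d[tuple(sorted((a, c)))]
            d_bc = d[tuple(sorted((b, c)))]
            updated[(c, next_id)] = (n_a * d_ac + n_b * d_bc) / (n_a + n_b)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(updated)
        clusters[next_id] = ((tree_a, tree_b), n_a + n_b)
        # Under the constant-rate assumption the node sits at half the distance.
        merges.append((tree_a, tree_b, d_ab / 2))
        next_id += 1
    (tree, _size), = clusters.values()
    return tree, merges


# Toy example with hypothetical distances.
tree, merges = upgma(["A", "B", "C"], [[0, 2, 6], [2, 0, 6], [6, 6, 0]])
```

Because every leaf ends at the same height in a UPGMA tree, a single vertical line can be drawn across it at a chosen distance, which is the property the naming procedure relies on.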

We have applied this procedure to the set of holotype se- 
quences given in Table 1 to produce the phylogenetic tree 
presented in Fig. 1. Vertical lines drawn through the tree show 
the boundaries used to define the various nomenclatural ranks. 
The name given to any particular toxin depends on the location 
of the node where the toxin enters the tree relative to these 
boundaries. A new toxin that joins the tree to the left of the 
leftmost boundary will be assigned a new primary rank (an 
Arabic number). A toxin that enters the tree between the left 
and central boundaries will be assigned a new secondary rank 
(an uppercase letter). It will have the same primary rank as the 
other toxins within that cluster. A toxin that enters the tree 
between the central and right boundaries will be assigned a 
new tertiary rank (a lowercase letter). Finally, a toxin that joins 
the tree to the right of the rightmost boundary will be assigned 
a new quaternary rank (another Arabic number). Toxins with 
identical sequences but isolated independently will receive sep- 
arate quaternary ranks. 

By this method each toxin will be assigned a unique name 
incorporating all four ranks. A completely novel toxin would 
currently be assigned the name Cry23Aa1. For the sake of 
convenience, however, we propose that the inclusion of the 
tertiary rank a and quaternary rank 1 be optional, their use 
dictated only by a need for clarity. This new toxin could there- 
fore simply be referred to as Cry23A. 
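
The four-rank name format and the optional tertiary rank a and quaternary rank 1 can be illustrated with a short Python sketch. The regular expression and the function names parse_name and short_name are hypothetical helpers written for this illustration, not part of any committee tool; the sketch assumes that a quaternary number is only written when the tertiary letter is also present.

```python
import re

# Mnemonic root, primary rank (Arabic number), secondary rank (uppercase
# letter), optional tertiary rank (lowercase letter), optional quaternary
# rank (Arabic number).
_NAME = re.compile(r"^(Cry|Cyt|cry|cyt)(\d+)([A-Z])(?:([a-z])(\d+)?)?$")


def parse_name(name):
    """Split a toxin or gene name into its four ranks.

    Omitted ranks default to tertiary 'a' and quaternary 1.
    """
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognized name: {name!r}")
    root, primary, secondary, tertiary, quaternary = match.groups()
    return (
        root,
        int(primary),
        secondary,
        tertiary or "a",
        int(quaternary) if quaternary else 1,
    )


def short_name(name):
    """Drop the tertiary rank 'a' and quaternary rank 1 where optional."""
    root, primary, secondary, tertiary, quaternary = parse_name(name)
    out = f"{root}{primary}{secondary}"
    if tertiary != "a" or quaternary != 1:
        out += tertiary
    if quaternary != 1:
        out += str(quaternary)
    return out
```

For example, short_name("Cry23Aa1") yields "Cry23A", while a name such as "Cry1Ab1" shortens only to "Cry1Ab" because its tertiary rank is not a.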

In choosing locations for rank boundaries, we attempted to 
construct a nomenclature reflecting significant evolutionary 
relationships while at the same time minimizing changes from 
the gene names assigned under the old system. In the resulting 
system, proteins with a common primary rank are similar 
enough that the percent identity can be defined with some 
confidence. Proteins with the same primary rank often affect 
the same order of insect; those with different secondary and 
tertiary ranks may have altered potency and targeting within an 
order. At the tertiary rank, differences can be due to the ac- 
cumulation of dispersed point mutations, but often they appear 
to have resulted from ancestral recombination events between 
genes differing at a lower rank level (9). The quaternary rank 
was established to group "alleles" of genes coding for known 
toxins that differ only slightly, either because of a few muta- 
tional changes or an imprecision in sequencing. To avoid con- 
fusion, however, the reader should bear in mind the differences 
between the quaternary rank number and the classical concept 
of the allele. Any cry gene specified with a quaternary rank is 
a natural isolate. No assumption about functionality is implied 
by the presence of this rank number in the gene name. In 
contrast, an allele number would be assumed, unless paren- 
thetical or subscripted information indicated otherwise, to de- 
note a nonfunctional mutant form of a wild-type gene found at 
a discrete genetic locus. Because of the somewhat modular 
nature of the Cry proteins and the effect that various segmental 
relationships could have on the clustering algorithm, it is likely 
that these boundaries will move slightly or even bend as the 
addition of new sequences changes the topology of the phylo- 
genetic tree. Currently the boundaries represent approxi- 
mately 95, 78, and 45% sequence identity. 
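
As a rough illustration of how the boundaries translate into names, the sketch below maps the percent identity at which a new toxin joins the tree to the highest rank that would be newly assigned. It is a simplification: the procedure described above reads the position from the phylogenetic tree rather than from a single pairwise value, the boundary values are the approximate figures quoted in the text, and the function name first_new_rank and the treatment of values falling exactly on a boundary are assumptions.

```python
# Approximate rank boundaries in percent amino acid identity, as quoted
# in the text. Values exactly on a boundary are assigned to the more
# closely related side (an assumption made for this sketch).
PRIMARY_BOUNDARY = 45.0
SECONDARY_BOUNDARY = 78.0
TERTIARY_BOUNDARY = 95.0


def first_new_rank(identity_pct):
    """Return the highest rank at which a joining toxin gets a new value."""
    if not 0.0 <= identity_pct <= 100.0:
        raise ValueError("percent identity must lie between 0 and 100")
    if identity_pct < PRIMARY_BOUNDARY:
        return "primary"      # new Arabic number, e.g. a new CryN
    if identity_pct < SECONDARY_BOUNDARY:
        return "secondary"    # new uppercase letter
    if identity_pct < TERTIARY_BOUNDARY:
        return "tertiary"     # new lowercase letter
    return "quaternary"       # new trailing Arabic number
```

Since the boundaries are expected to shift as sequences are added, the constants above would need to be revisited rather than treated as fixed thresholds.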

A B. thuringiensis Pesticidal Crystal Protein Nomenclature 
Committee, consisting of the authors of this paper, will remain 
as a standing committee of the Bacillus Genetic Stock Center 
(BGSC) to assist workers in the field of B. thuringiensis genetics 
in assigning names to new Cry and Cyt toxins. The corresponding 
gene or protein sequences must first be deposited into a 
publicly accessible database (GenBank, EMBL, or PIR) and 
released by the repository for electronic publication in the 
database so that the scientific community may conduct an 
independent analysis. Researchers should submit new sequences 
directly to the BGSC director (D. R. Zeigler), either 
by electronic mail (zeigler.1@osu.edu) or on computer diskette. 
The director will analyze the amino acid sequence as 
described above and suggest the appropriate name, subject to 
the approval of the committee. The committee will periodically 
review the literature of the Cry and Cyt toxins and publish a 
comprehensive list. This list, alongside other relevant information, 
will also be available via the Internet at the following 
URL: http://www.biols.susx.ac.uk/Home/Neil_Crickmore/Bt/. 

The current list of cry and cyt genes (including quaternary 
ranks) is given in Table 1. New gene names are listed with their 
previous names, their GenBank accession numbers, and pub- 
lished references. The quaternary ranks were assigned in the 
order that the gene sequences were discovered in the literature 
or submitted to the committee. Genes assigned the quaternary 
rank 1 represent holotype sequences. 

The boundaries shown in Fig. 1 allow most cry genes to 
retain the names they received under the system of Hofte and 
Whiteley (44), after a substitution of Arabic for Roman numerals. 
There are a few notable exceptions: cryIG becomes 
cry9A, cryIIIC becomes cry7Aa, cryIIID becomes cry3C, cryIVC 
becomes cry10A, cryIVD becomes cry11A, cytA becomes cyt1A, 
and cytB becomes cyt2A (Table 1). Under the revised system, 
the known Cry and Cyt proteins fall into 24 sets at the primary 
rank: Cyt1, Cyt2, and Cry1 through Cry22. 

ROBUSTNESS OF THE NOMENCLATURE 

The robustness of the current naming process was assessed 
by a number of additional analyses. The choice of clustering 
algorithm (unweighted pair-group method using arithmetic av- 
erages) was driven largely by the consistent location of a root 
and constant branch lengths, resulting in a common vertical 
alignment of sequence names and essentially allowing a "ruler 
across the tree" approach to naming. It has the drawback of 
imposing a common evolutionary clock on the clustering pro- 
cess, an assumption that cannot be assured. The distance met- 
ric related to percent identity (essentially 1 minus the fraction 
of identical residues of the total compared without gaps) is the 
one most commonly found as the output of sequence compar- 
ison programs, including CLUSTAL W. For phylogenetic anal- 
ysis, a more usual distance metric relates to the number of 
substitutions per site to convert one sequence to the other 
(e.g., Dayhoff's point accepted mutation [PAM]) and accounts 
for the possibility of multiple substitutions per site as the se- 
quences are more divergent. The latter method has the draw- 
back of being more computationally intensive, and, for very 
divergent sequences, requiring too large a value, resulting in 
numeric computation failures. They also differ in the way se- 
quences of unequal length are handled, with the percent iden- 
tity method typically ignoring excess sequence and the other 
methods assigning a penalty. This is particularly important for 
crystal proteins, since a number of them lack the C-terminal 
protoxin segments yet are quite related to some longer toxins 
in the N-terminal toxin segment; we feel that the stronger 
association of such relationships found by the percent identity 
method is preferred. 
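
The percent identity distance described above (1 minus the fraction of identical residues among the positions compared without gaps) can be sketched in a few lines of Python. This is an illustrative helper, not CLUSTAL W's code; the function name identity_distance and the use of "-" as the gap character are assumptions.

```python
def identity_distance(seq_a, seq_b, gap="-"):
    """Distance between two aligned sequences as 1 - fraction identical.

    Columns where either sequence has a gap are skipped, so excess
    sequence in the longer protein carries no penalty.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue  # ignore gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 1.0 - identical / compared
```

Under this metric a toxin lacking the C-terminal protoxin segment still scores as close to a longer toxin with a similar N-terminal segment, because the unmatched tail is never counted.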

To assess the effect of using the neighbor-joining method to 
generate an unrooted tree, CLUSTAL W routines were used 
to generate such a tree with 1,000 bootstraps of the sequence 
alignment we used for Fig. 1. When an appropriate outgroup 
was chosen, the resulting tree (not shown) resembled our Fig. 
1. The bootstrap values indicated that the tree thus generated 

TABLE 1. Known cry and cyt gene sequences with revised nomenclature assignments 

Revised gene name    Original gene or protein name    Accession no.    Coding region a    Reference 

cry1Aa1 
cry1Aa2 
cry1Aa3 
cry1Aa4 
cry1Aa5 
cry1Aa6 
cry1Ab1 
cry1Ab2 
cry1Ab3 
cry1Ab4 
cry1Ab5 
cry1Ab6 
cry1Ab7 
cry1Ab8 
cry1Ab9 
cry1Ab10 
cry1Ac1 
cry1Ac2 
cry1Ac3 
cry1Ac4 
cry1Ac5 
cry1Ac6 
cry1Ac7 
cry1Ac8 
cry1Ac9 
cry1Ac10 
cry1Ad1 
cry1Ae1 
cry1Af1 
cry1Ba1 
cry1Ba2 
cry1Bb1 
cry1Bc1 
cry1Bd1 
cry1Ca1 
cry1Ca2 
cry1Ca3 
cry1Ca4 
cry1Ca5 
cry1Ca6 
cry1Ca7 
cry1Cb1 
cry1Da1 
cry1Db1 
cry1Ea1 
cry1Ea2 
cry1Ea3 
cry1Ea4 
cry1Eb1 
cry1Fa1 
cry1Fa2 
cry1Fb1 
cry1Ga1 
cry1Ga2 
cry1Gb1 
cry1Ha1 
cry1Hb1 
cry1Ia1 
cry1Ia2 
cry1Ia3 
cry1Ia4 
cry1Ia5 
cry1Ib1 
cry1Ja1 
cry1Jb1 
cry1Ka1 
cry2Aa1 
cry2Aa2 
cry2Aa3 
cry2Ab1 



cryIA(a) 
cryIA(a) 
cryIA(a) 
cryIA(a) 
cryIA(a) 
cryIA(a) 
cryIA(b) 
cryIA(b) 
cryIA(b) 
cryIA(b) 
cryIA(b) 
cryIA(b) 
cryIA(b) 
cryIA(b) 
cryIA(b) 
cryIA(b) 
cryIA(c) 
cryIA(c) 
cryIA(c) 
cryIA(c) 
cryIA(c) 
cryIA(c) 
cryIA(c) 
cryIA(c) 
cryIA(c) 

cryIA(c) 
cryIA(e) 
icp 
cryIB 
ET5 
cryIB(c) 
cryE1 
cryIC 
cryIC 
cryIC 
cryIC 
cryIC 
cryIC 
cryIC 
cryIC(b) 
cryID 
prtB 
cryIE 
cryIE 
cryIE 
cryIE(b) 
cryIF 
cryIF 
prtD 
prtA 
cryIM 
cryH2 
prtC 
cryV 
cryV 
cryV 
cryV 
cryV159 
cryV465 
ET4 
ET1 
cryIIA 
cryIIA 

cryIIB 



M11250 

M10917 

D00348 

X13535 

D17518 

U43605 

M13898 

M12661 

M15271 

D00117 

X04698 

M37263 

X13233 

M16463 

X54939 

A29125 

M11068 

M35524 

X54159 

M73249 

M73248 

U43606 

U87793 

U87397 

U89872 

AJ002514 

M73250 

M65252 

U82003 

X06711 

X95704 

L32020 

Z46442 

U70726 

X07518 

X13620 

M73251 

A27642 

X96682 

X96683 

X96684 

M97880 

X54160 

Z22511 

X53985 

X56144 

M73252 

U94323 

M73253 

M63897 

M73254 

Z22512 

Z22510 

Y09326 

U70725 

Z22513 

U35780 

X62821 

M98544 

L36338 

L49391 

Y08920 

U07642 

L32019 

U31527 

U28801 

M31738 

M23723 

D86064 

M23724 



527-4054 
153->2955 
73-3600 
1-3528 
81-3608 

1->1860 
142-3606 

155-3622 

156-3620 
163-3627 
141-3605 

73-3537 
1-3465 

157-3621 
73-3537 

b 

388-3921 
239-3769 
339->2192 
1-3534 
1-3531 
1->1821 
976-4509 
153-3686 
388-3921 
388-3921 
1-3537 
81-3623 
172->2905 

1-3684 
186-3869 
67-3753 
141-3839 

47-3613 
241->2711 

1-3570 
234-3800 
1->2268 
1->2268 
1->2268 
296-3823 
264-3758 
241-3720 
130-3642 
1-3513 
1-3513 
388-3900 
1-3522 
478-3999 
1-3525 
483-4004 
67-3564 
692^210 

530-4045 
728-4195 
355-2511 
1-2157 
279-2435 

61-2217 
524-2680 
237-2393 

99-3519 
177-3686 
451-4098 
156-2054 
1840-3738 
2007-3911 
1-1899 



92 

98 

99 

62 

113 

63 

119 

111 

31 

50 

40 

37 

36 

69 

13 

28 

3 

117 

18 

84 

83 

63 

38 

71 

33 

107 

79 

60 

49 

10 

105 

25 

6 

12 

45 

88 

79 

114 

106 

106 

106 

48 

42 

56 

115 

7 

82 

47 

81 

14 

80 

56 

56 

96 

12 

56 

53 

108 

34 

100 

54 

94 

100 

25 

116 

52 

20 

123 

89 

123 



Revised     Original gene    Accession  Coding        Reference
gene name   or protein name  no.        region a

cry2Ab2     cryIIB           X55416     874-2775      17
cry2Ac1     cryIIC           X57252     2125-3990     124
cry3Aa1     cryIIIA          M22472     25-1956       39
cry3Aa2     cryIIIA          J02978     241-2172      93
cry3Aa3     cryIIIA          Y00420     566-2497      41
cry3Aa4     cryIIIA          M30503     201-2132      65
cry3Aa5     cryIIIA          M37207     569-2500      22
cry3Aa6     cryIIIA          U10985     569-2500      1
cry3Ba1     cryIIIB2         X17123     25->1977      101
cry3Ba2     cryIIIB          A07234     342-2297      85
cry3Bb1     cryIIIBb         M89794     202-2157      24
cry3Bb2     cryIIIC(b)       U31633     144-2099      23
cry3Ca1     cryIIID          X59797     232-2178      59
cry4Aa1     cryIVA           Y00423     1-3540        121
cry4Aa2     cryIVA           D00248     393-3935      95
cry4Ba1     cryIVB           X07423     157-3564      16
cry4Ba2     cryIVB           X07082     151-3558      112
cry4Ba3     cryIVB           M20242     526-3930      125
cry4Ba4     cryIVB           D00247     461-3865      95
cry5Aa1     cryVA(a)         L07025     1->4155       102
cry5Ab1     cryVA(b)         L07026     1->3867       67
cry5Ac1                      I34543     1->3660       76
cry5Ba1     PS86Q3           U19725     1->3735       76
cry6Aa1     cryVIA           L07022     1->1425       68
cry6Ba1     cryVIB           L07024     1->1185       67
cry7Aa1     cryIIIC          M64478     184-3597      58
cry7Ab1     cryIIIC(b)       U04367     1->3414       75
cry7Ab2     cryIIIC(c)       U04368     1->3414       75
cry8Aa1     cryIIIE          U04364     1->3471       29
cry8Ba1     cryIIIG          U04365     1->3507       66
cry8Ca1     cryIIIF          U04366     1-3447        70
cry9Aa1     cryIG            X58120     5807-9274     104
cry9Aa2     cryIG            X58534     385->3837     32
cry9Ba1     cryX             X75019     26-3488       97
cry9Ca1     cryIH            Z37527     2096-5569     57
cry9Da1     N141             D85560     47-3553       4
cry9Da2                      AF042733   <1->1937      122
cry10Aa1    cryIVC           M12662     941-2965      111
cry11Aa1    cryIVD           M31737     41-1969       21
cry11Aa2    cryIVD           M22860     <1-235        2
cry11Ba1    Jeg80            X86902     64-2238       19
cry11Bb1    94 kDa           AF017416                 72
cry12Aa1    cryVB            L07027     1->3771       67
cry13Aa1    cryVC            L07023     1-2409        90
cry14Aa1    cryVD            U13955     1-3558        77
cry15Aa1    34kDa            M76442     1036-2055     11
cry16Aa1    cbm71            X94146     158-1996      5
cry17Aa1    cbm72            X99478     12-1865       5
cry18Aa1    cryBP1           X99049     743-2860      126
cry19Aa1    Jeg65            Y07603     719-2662      86
cry19Ba1                     D88381                   87
cry20Aa1    86kDa            U82518     60-2318       61
cry21Aa1                     I32932     1-3501        74
cry22Aa1                     I34547     1-2169        76
cyt1Aa1     cytA             X03182     140-886       118
cyt1Aa2     cytA             X04338     509-1255      120
cyt1Aa3     cytA             Y00135     36-782        26
cyt1Aa4     cytA             M35968     67-813        30
cyt1Ab1     cytM             X98793     28-777        109
cyt1Ba1                      U37196     1-795         78
cyt2Aa1     cytB             Z14147     270-1046      51
cyt2Ba1     "cytB"           U52043     287-655       35
cyt2Bb1                      U82519     416-1204      15


a The symbols < and > indicate that the coding region extends up- or downstream, respectively, from the known sequence data. 
b Only the polypeptide sequence has been reported. 
[FIG. 1 graphic not reproduced. Horizontal axis: percent amino acid sequence identity (10 to 90). Vertical boundaries: primary rank, secondary rank, tertiary rank. Lineage labels: main Cry lineage, Cyt lineage, outlying Cry lineages. Legible tip labels: Cry1Ab, Cry1Ae, Cry1Af, Cry1Aa, Cry1Ad, Cry1Ac, Cry1Fa, Cry1Fb, Cry1Ga, Cry1Gb, Cry1Da, Cry1Db, Cry1Ha, Cry1Hb, Cry1Ea, Cry1Eb.] 

FIG. 1. Phylogram demonstrating amino acid sequence identity among Cry and Cyt proteins. This phylogenetic tree is modified from a TREEVIEW visualization 
of NEIGHBOR treatment of a CLUSTAL W multiple alignment and distance matrix of the full-length toxin sequences, as described in the text. The gray vertical bars 
demarcate the four levels of nomenclature ranks. Based on the low percentage of identical residues and the absence of any conserved sequence blocks in 
multiple-sequence alignments, the lower four lineages are not treated as part of the main toxin family, and their nodes have been replaced with dashed horizontal lines 
in this figure. 



had significant branch points deeper in the tree than the cho- 
sen primary rank in the nomenclature. This sort of analysis was 
rejected as unsuitable for the purposes of Cry nomenclature 
due to the generally ragged branch lengths it produced and the 
requirement for the careful choice of an outgroup. 
An alternative method of clustering protein sequences, capable 
of handling sequences that are quite diverse, is parsimony 
analysis. A consensus tree generated from 100 bootstraps 
of such an analysis displaces the two incomplete Cry1 
sequences (Cry1Bd and Cry1Af) and the two Cry1 sequences 
lacking the C-terminal protoxin segments (Cry1Ia and Cry1Ib) 
into a region of the tree populated with such shortened 
sequences (not shown). With the further exceptions of Cry12A 
being interjected into the Cry5 cluster and a number of 
sequences besides Cry6B clustering higher in the tree than 
Cry6A, the proposed nomenclature successfully reflects the 
grouping of sequences provided by this method of analysis as 
well. 

As noted above, the usual distance metrics for phylogenetic 
analysis account for multiple substitutions per site; most com- 
monly, the Dayhoff PAM metric is used. When this distance 
metric was applied to the alignment used to make Fig. 1, a 
large number of the sequence pairs were found to have infinite 
distance. Therefore, the main Cry lineage and the Cyt lineage 
were separately aligned, the distances were calculated, and the 
distance matrices were clustered by using the FITCH program 
(of the PHYLIP software package). This method of analysis 
revealed several strongly associated groups of sequences 
(>90% of trees) in the main Cry lineage that extend deeper 
into the tree than the primary rank assigned in the proposed 
nomenclature: Cry1; Cry3; Cry4; Cry7; the Cry5, Cry12-Cry13- 
Cry14-Cry21 group; the Cry8-Cry9 group; the Cry10-Cry19 
group; the Cry16-Cry17 group; and the Cry2-Cry11-Cry18 
group. Many of these groups, however, were separated by 
branch points that were either nonmajority or were found 
<60% of the time; thus, the arrangement of these groups 
would be likely to change with additional sequence additions. 
At the secondary rank, the only anomaly with respect to the 
proposed nomenclature was the interjection of the Cry1Ia and 
Cry1Ib sequences into the Cry1B group. This effect may be due 
to an artificially reduced distance between the Cry1I sequences 
and the incomplete Cry1Bd sequence caused by the particular 
distance metric used. The Cyt lineage sequences were separated 
into the expected two primary rank groups that separate 
into the expected secondary rank groupings. This more standard 
phylogenetic approach also suffers from an accentuated 
visual disorientation of uneven branch lengths and shortening 
of the more closely related branches, especially at the tertiary 
rank (lowercase letter), where a great deal of comparative 
work has been done among the Cry1 toxins. 

In summary, the proposed nomenclature uses readily avail- 
able software that can be easily interpreted by investigators in 
the field and meets their needs as well as, or better than, 
alternative methods of analysis and presentation. When the 
holotype toxins were analyzed by alternative phylogenetic 
methods, the hierarchy implied by the nomenclature was es- 
sentially consistent with the resulting phylogenetic clustering, 
and the few exceptions were largely explainable by known 
properties of the sequences in question. 
The structure of the δ-endotoxin from Bacillus 
thuringiensis subsp. tenebrionis that is specifically 
toxic to Coleoptera insects (beetle toxin) has been 
determined at 2.5 Å resolution. It comprises three 
domains which are, from the N- to C-termini, a 
seven-helix bundle, a three-sheet domain, and a β 
sandwich. The core of the molecule encompassing 
all the domain interfaces is built from conserved 
sequence segments of the active δ-endotoxins. 
Therefore the structure represents the general fold 
of this family of insecticidal proteins. The bundle 
of long, hydrophobic and amphipathic helices is 
equipped for pore formation in the insect membrane, 
and regions of the three-sheet domain are 
probably responsible for receptor binding. 



The δ-endotoxins are a family of insecticidal proteins produced 
by Bacillus thuringiensis (B.t.) during sporulation, having relative 
molecular masses (M_r) 60,000-70,000 (60K-70K) in the active 
form and specific toxicities against insects in the orders of 
Lepidoptera, Diptera and Coleoptera 1,2. These toxins have been 
formulated into commercial insecticides for three decades 3, and 
now insect-resistant plants are engineered by transformation 
with Lepidoptera-specific toxin genes 4-6. In the bacterium 
δ-endotoxins are synthesized as protoxins of M_r 70K-135K and 
crystallize as a parasporal inclusion ~1 µm in size, in which form 
they are ingested by the susceptible insect. The microcrystal 
dissolves in the alkaline pH of the midgut and the protoxin is 
cleaved by gut proteases to release the active toxin. δ-Endotoxins 
activated in vitro bind specifically and with high affinity (K_D ≈ 
0.1-20 nM) to protein receptors on brush-border membrane 
vesicles derived from the gut epithelium of target insects 7-9 and 
create leakage channels of 10-20 Å diameter in the cell membrane 10. 
In vivo such membrane lesions lead to swelling and 
lysis of the gut epithelium 11 and death of the insect ensues 
through starvation and septicaemia. Active δ-endotoxins of 
different specificities show five strongly conserved regions in 
their amino-acid sequences 1,12. Exchanging sequence segments 
in the divergent regions between toxins of different specificities 
can produce active hybrids showing altered target 
specificity 13-15. We have determined the atomic structure of a 
Coleoptera-specific δ-endotoxin (CryIIIA, beetle toxin) from 
B.t. subsp. tenebrionis 16-18 to elucidate the structural basis for 
target specificity and membrane perforation by this family of 
proteins. 

Structure determination 

Parasporal crystals of the beetle toxin contain the full-length 
644-residue protoxin 17 as the minor component, and a product 
of bacterial processing with 57 residues removed from the 
N-terminus as the major component 19. The latter (M_r 67K) is 
similar in sequence to the active form of other δ-endotoxins. 
After solubilization, papain cleavage converts the mixture to the 
67K toxin (see legend to Table 1). This was recrystallized in the 
original crystal form of the parasporal crystals, space group 
C222₁ and cell dimensions 117.1 by 134.2 by 104.5 Å, containing 
one molecule per asymmetric unit and 55% solvent by volume 18. 

Initial evaluation of derivatives was carried out at 4.5 Å 
resolution with data collected on the FAST TV diffractometer 20 using 
CuKα radiation. Complete datasets (Table 1) were then collected 
to 2.5 Å resolution from native crystals using the imaging 
plate systems at the EMBL outstation at DESY and from the 
mercury and platinum derivatives on film at SRS Daresbury. 
The electron density map (Fig. 1) at 2.5 Å resolution calculated 
with phases from multiple isomorphous replacement (mean 
figure of merit, 0.63) was easily interpretable and was improved 
by solvent flattening 21,22. A continuous polypeptide chain from 
residue 61 to residue 644 at the C terminus was traced 
unambiguously, and most side-chain atoms could be located in the 
map. The atomic model was built using the graphics program 
O (ref. 23) and had an initial R-factor of 37% for all data to 
2.5 Å. After preliminary refinement using the program X-PLOR 
(ref. 24), the current model, containing 584 amino acid residues 
and 40 bound water molecules, has an R-factor of 19.9% and 
r.m.s. bond length deviation of 0.017 Å. 

Description of the structure 

Overview. The beetle toxin is a wedge-shaped molecule with a 
radius of gyration of 58 Å. As shown in Fig. 2a, it comprises 
three domains. Domain I, from the N terminus of the 67K toxin 
to residue 290, is a seven-helix bundle in which a central helix 
is completely surrounded by six outer helices tilted at about 
+20° to it (Fig. 3b,c). Domain II, from residues 291 to 500, 
contains three antiparallel β sheets packed around a hydrophobic 
core with a triangular cross-section (Fig. 4). Domain III, 
from residues 501 to 644 at the C terminus is a sandwich of two 
antiparallel β sheets (Fig. 5). Domains I and III make up the 

TABLE 1 Data collection and phasing statistics 

Data collection 

Data              Method of    Number of  Resolution  Number of     Unique reflections  R_merge
                  collection   crystals   (Å)         measurements  (% completeness)
Native            image plate  8          2.5         121,767       27,727 (100)        0.108
CH3HgNO3          film         7          2.5         [illegible]   [illegible]
Hg(CH3COO)2       film         5          2.5         60,224        25,919 (94.5)       0.103
cis-Pt(NH3)2Cl2   film         7          2.5         86,629        25,924 (94.5)       0.107
K2OsO4            FAST         1          4.5         21,143        4,680 (100)         0.077
HoCl3             FAST         1          4.5         20,013        4,701 (100)         0.069

Phasing statistics 

Derivative        Anomalous  Number    [illegible]  R_Cullis  Phasing power §
                  data       of sites                         (resolution, Å)
CH3HgNO3          no         3         0.183        0.715     1.56 (2.5)
Hg(CH3COO)2       yes        6         0.247        0.609     2.28 (2.5)
cis-Pt(NH3)2Cl2   no         5         0.185        0.682     1.54 (2.5)
K2OsO4            no         4         0.149        0.757     1.26 (5.5)
HoCl3             no         3         0.095        0.741     1.35 (5.0)



Protein preparation: Solubilized parasporal crystals from B.t. subsp. tenebrionis were incubated at 0.5 mg ml⁻¹ protein with 0.125 units per ml of 
agarose-linked papain (Boehringer) in 3.3 M NaBr, 0.05 M sodium phosphate, pH 7.0, and 0.1 mg ml⁻¹ phenylmethylsulphonylfluoride (PMSF) for 30 min at 
20 °C. Digestion was stopped by adding tosyl lysinechloromethylketone (TLCK) to 0.125 mg ml⁻¹ and Na₂CO₃ to one fifth volume and removing the 
enzyme-beads. The 67K beetle toxin was then purified by gel filtration on Sephadex G75 equilibrated with 0.1 M NaHCO₃, pH 10.5, 0.5 M NaBr. Crystallization: 
Single crystals were obtained by microdialysis at a protein concentration of 2.5 mg ml⁻¹ against 0.1 M NaHCO₃, pH 9.5, 1.2 M NaBr at 4 °C overnight, then 
against 0.1 M NaHCO₃, pH 9.2, 0.5 M NaBr at 16 °C; 3 mM NaN₃, 0.1 mM PMSF and 0.1 mg ml⁻¹ TLCK were present in all buffers. Crystals were transferred 
by stages to 0.05 M 2-(N-morpholino)ethanesulphonic acid (MES), pH 6.5, for derivative preparation and mounted in 0.03% low-melting agarose in this buffer 
during data collection. Data collection: Image plate and film data were processed using MOSFLM (Imperial College, London) and CCP4 programs (Daresbury, 
UK). FAST (ref. 20) data were collected and processed with MADNES (ref. 45), and scaled in 3° batches. Derivatives: Crystals were soaked respectively in 0.25 mM 
CH₃HgNO₃ for 3.5 h, in 1 mM Hg(CH₃COO)₂ for 14 h, in freshly prepared 1 mM cis-Pt(NH₃)₂Cl₂ for 21 h, in saturated K₂OsO₄ for 35 h, and in 2 mM HoCl₃ for 
3 days. Phase calculation: Two heavy-atom sites in each derivative were located from difference Patterson functions, except in the case of Hg(CH₃COO)₂ 
for which 3 sites were located, and the remaining sites were found by cross-phased difference Fouriers. Heavy-atom parameters were refined against 
centric data and phases calculated for all data using the program PHARE (G. Bricogne). The two low-resolution derivatives were refined against phases 
calculated from the high-resolution derivatives. Phasing with the three high-resolution derivatives gave an overall figure of merit of 0.61 (25-2.5 Å) and a 
clearly interpretable map. Including the remaining derivatives slightly improved the connectivity of the map (overall figure of merit 0.63), and four cycles of 
solvent flattening using a 50% solvent content and a 9 Å radius in mask calculation (refs 21, 22) improved the overall definition of densities. The starting model 
was built using the program O (ref. 23) with the Bones option for main-chain tracing and the autobuild and manip options for side chains. Refinement by 
simulated annealing using the program X-PLOR (ref. 24) reduced the R-factor from 0.37 to 0.25 without individual B-factors, and to 0.23 with restrained 
individual B-factors. The model was adjusted in the loops 154-156, 429-436, and 483-488, and had 40 solvent molecules added, then refined by X-PLOR 
again. The current model has an R-factor of 19.9%, with r.m.s. bond length deviation of 0.017 Å, r.m.s. bond angle deviation of 3.2°, and average atomic 
B-factor of 18 Å². 

* $R_{\mathrm{merge}} = \sum_i |I_i - \langle I \rangle| / \sum_i \langle I \rangle$, where $I_i$ are intensity measurements for a reflection, and $\langle I \rangle$ is the mean intensity for this reflection. 
† $R_{\mathrm{deriv}} = \sum |F_{PH} - F_P| / \sum F_P$, where $F_{PH}$ is the structure factor amplitude of the derivative crystal and $F_P$ is that of the native. 

‡ $R_{\mathrm{Cullis}} = \sum \big| |F_{PH} \pm F_P| - F_H(\mathrm{calc}) \big| / \sum |F_{PH} - F_P|$, where $F_P$ and $F_{PH}$ are defined as for $R_{\mathrm{deriv}}$, and $F_H(\mathrm{calc})$ is the calculated heavy-atom structure factor amplitude, 
summed over centric data only. 

§ Phasing power = $\langle F_H \rangle / E$, the r.m.s. heavy-atom structure factor amplitudes divided by the residual lack of closure error. 
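
The merging residual defined in the first footnote can be sketched in a few lines. This is an illustrative calculation only: the function name, the dictionary layout and the intensities are hypothetical, and the sketch assumes that reflections measured only once are left out of both sums, which is a common convention rather than something stated in the table legend.

```python
def r_merge(measurements):
    """R_merge = sum |I_i - <I>| / sum <I>, over all repeated measurements.

    `measurements` maps a reflection index (h, k, l) to a list of intensity
    measurements of that reflection (hypothetical layout for this sketch).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in measurements.values():
        if len(intensities) < 2:
            continue  # assumption: singly measured reflections are skipped
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        # Summing <I> once per measurement equals summing the intensities.
        denominator += mean_intensity * len(intensities)
    if denominator == 0:
        raise ValueError("no repeated measurements to merge")
    return numerator / denominator


# Hypothetical repeated measurements of two reflections.
data = {
    (1, 0, 0): [100.0, 110.0, 90.0],
    (0, 1, 0): [50.0, 50.0],
}
print(f"R_merge = {r_merge(data):.3f}")
```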



bulky end of the molecule. Through their contact one of the 
two β sheets in domain III is almost entirely buried. To our 
knowledge (see, for example, ref. 25), the packing of helices in 
domain I and of sheets in domain II are both novel arrangements. 

Domain I. The central helix in this seven-helix bundle is α5 (Fig. 
3b,c), which is oriented with its C terminus towards the bulky 
end of the molecule. Viewed from this end, the outer helices 
are arranged anticlockwise in the order of α1, α2, α3, α4, α6 
and α7, with helices α1 and α7 adjacent to the β-sheet domains; 
α2 is interrupted by a non-helical section and only the leading 
half, α2a, is packed against α5. Figure 3a shows the alignment 
of amino-acid sequence on the surfaces of the helices. The 
helices are long, especially α3 to α7, which contain respectively 
8, 7, 6, 9 and 7 complete helical turns and hence would be long 
enough to span the 30-Å thick hydrophobic region of a membrane 
bilayer. Furthermore, the six outer helices bear a strip of 
hydrophobic residues (defined by ΔG ≥ 0 for transfer from oil 
to water) down their entire length on the side facing helix α5, 
so they are amphipathic. In keeping with the general observation 
that secondary structures are close-packed and bury hydrophobic 
surfaces (ref. 26), the helix contact angles in this domain cluster 
around +20° rather than -50°, giving the bundle a bouquet-like 
appearance (Fig. 3b). Figure 3c shows the bundle in cross-section. 
The interhelical space contains 27 aromatic residues 
which are packed in the edge-to-face fashion (ref. 27); all polar 
groups in this region are hydrogen-bonded or in salt bridges. 
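
The claim that these helices are long enough to span the bilayer can be checked with simple arithmetic. The sketch below assumes an ideal α helix with a pitch of 5.4 Å per turn (3.6 residues per turn at 1.5 Å rise per residue) and ignores the roughly 20° tilt of the helices; the variable names are illustrative and not taken from the paper.

```python
# Assumed geometry of an ideal alpha helix: 5.4 Å rise per complete turn.
HELIX_PITCH_ANGSTROM = 5.4
# Thickness of the hydrophobic region of a bilayer, as quoted in the text.
BILAYER_CORE_ANGSTROM = 30.0

# Complete helical turns reported in the text for helices alpha3 to alpha7.
turns_per_helix = {"alpha3": 8, "alpha4": 7, "alpha5": 6, "alpha6": 9, "alpha7": 7}

# Approximate end-to-end length of each helix along its axis.
helix_lengths = {
    name: turns * HELIX_PITCH_ANGSTROM for name, turns in turns_per_helix.items()
}

for name, length in helix_lengths.items():
    spans = length >= BILAYER_CORE_ANGSTROM
    print(f"{name}: about {length:.1f} Å, spans 30 Å core: {spans}")
```

Under these assumptions even the shortest of the five, the six-turn α5, reaches about 32 Å, which is consistent with the statement in the text.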
The concentric arrangement of the seven-helix bundle is distinct 
from the two-layered type seen in bacteriorhodopsin. There 
is some resemblance to the pore-forming domain of colicin A (ref. 28), 
in which two hydrophobic helices are shielded from solvent by 
eight amphiphilic helices, but the colicin helices are generally 
shorter. Like the colicin helices, the bundle in the beetle toxin 
may be a soluble form of packaging for the hydrophobic and 
amphiphilic helices that will form pores in the membrane after 
a large change in conformation. 

Domain II. In Fig. 4a and b the three sheets of this domain are 
laid side-by-side, as they would be seen from the solvent. There 
is an apparent structural duplication between the four-stranded 
antiparallel sheets, sheet 1 and sheet 2. The chain connections, 
β4, β3, β2, β5 and β8, β7, β6, β9, respectively, follow the order 
of +3, -1, -1, +3, which is typical of the 'Greek-key' topology 
(ref. 29). From both sheets the inner strands, β3 and β2 as well 
as β7 and β6, extend some 20 Å to the apex of the molecule as 
two-stranded β ribbons; and at the point of departure from the 
sheets there is a β-bulge in β3 and in β7 to twist the plane of 
the ribbon by nearly 90° relative to the sheet. The connections 
between the outer strands cross over the ribbons on the solvent 
side. 

The pseudo-symmetry between these sheets is very approximate. 
Using the least squares option in O (ref. 23), the sheet 
region of the strands β3 and β2 can be brought to superimpose 
on that of β7 and β6, with an r.m.s. fit of 0.72 Å for 13 α carbons. 
But the r.m.s. fit increased to 1.1 Å for 23 α carbons of the 
FIG. 1 Electron density map in the neighbourhood 
of Cys 243, calculated a, using combined phases (ref. 46) 
from multiple isomorphous replacement and solvent 
flattening, and b, using combined experimental 
and model phases (ref. 46) after refinement by 
X-PLOR. The refined structure is shown superimposed 
for reference. Although Cys 243 is a major 
site of both the methylmercury (NM) and mercuric 
acetate (MA) derivatives, the methylmercury site 
is in a hydrophobic environment compared with 
the mercuric acetate site. 





whole inner strands including the ribbon region, and 1.7 Å for 
36 α carbons on all four strands. Nonetheless, the sequence 
alignment brought by this superposition of the two sheets 
revealed a low level of internal homology, with seven pairs of 
equivalent residues (shown in bold) out of 41 aligned α carbons: 

338 HRIQFHTRFQP(6)SFNYWS(1)NYVSTBPSI(0)GSHDIITSPF(10)NLKPN 395 
402 AVAHTNUVWP(0)SAVYSG(1)TKVBPSQYll(3)DEASTQnrDS(7)SWDSI 453 
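
The "low level of internal homology" can be put as a number: seven identical pairs out of 41 aligned positions is about 17% identity. The sketch below shows one way such a figure could be computed from two aligned strings. The function name and the toy sequences are hypothetical; the toy strings are not the toxin sequences, and the gap character is an assumption of this sketch.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of identical residues over aligned, non-gap positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Toy alignment (hypothetical sequences, for illustration only).
print(f"toy alignment: {percent_identity('ACDE-FGH', 'ACQEWFYH'):.1f}% identity")

# The figure quoted in the text: 7 equivalent pairs out of 41 aligned positions.
print(f"7 of 41 aligned positions: {100.0 * 7 / 41:.1f}% identity")
```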

The three-stranded sheet 3 is formed by two separate polypeptide 
segments. The C-terminal segment of domain II contributes 
the two-stranded ribbon of β10 and β11, whereas the N-terminal 
segment of this domain contributes strand β1, which is 
hydrogen-bonded to β11; β1 is followed by a two-turn helix α8 and 
an extended chain. 

Figure 4c and d shows in side view and in cross-section that 
the three antiparallel sheets are packed around a triangular 
hydrophobic core. This brings the strand β10 on the edge of 
sheet 3 into proximity with strand β4 on the edge of sheet 1, as 
well as placing the loops at the end of the three β ribbons into 
a region of about 12 Å radius at the molecular apex. This domain 
is in contact with helix α7 of domain I on the face of sheet 3 
(Fig. 4c). 

Domain III. Figure 5 is a ribbon drawing of the strands forming 
the two sheets of the β sandwich. The sheet containing the 
C-terminal strand is in contact with domain I and will be called 
the inner sheet. This domain has the 'jelly-roll' topology (ref. 29), 
because it can be generated by folding an antiparallel β ribbon 
which starts with β13 (N terminus) and β23 (C terminus) on the 
inner sheet, and ends in the loop between β18 and β19 on the 
outer sheet; β14 is a short excursion from this ribbon and forms 
the fifth antiparallel strand of the outer sheet. In addition, small 
parallel sheets are formed at the edge of the β sandwich through 
hydrogen bonding of strand β12 to β16 at the edge of the outer 
sheet, and β1 to β13 at the edge of the inner sheet. 
Distribution of conserved sequences. The core of the beetle toxin 
molecule encompassing the domain interfaces is built from the 
five sequence blocks that are highly conserved throughout the 
δ-endotoxin family (ref. 1) (Fig. 2b,c). Block 1, located in the beetle 
toxin sequence at residues 189-218, corresponds to the central 
helix (α5) of the bundle in domain I. Block 2, residues 239-305, 
overlaps with the latter half of α6, and with α7 and β1; the 
latter hydrogen-bonds to the edge of the inner sheet in domain 
III before forming part of the three-stranded sheet 3 in domain 
II. Block 3, residues 491-538, overlaps with the latter part of 
β11, where it is hydrogen-bonded to β1, and with the loops 
connecting domains II and III. The remainder of block 3 
together with blocks 4 and 5, namely residues 560-569 and 633 
to the C terminus, respectively, constitute the three buried 
strands of the inner antiparallel sheet in domain III. The high 
degree of conservation of internal residues implies that 
homologous proteins would adopt a similar fold. Using the 
beetle toxin structure as a model, we can therefore propose a 
basis for the insecticidal activity of δ-endotoxins as a family. 

Basis of insecticidal function 

Solubility. The beetle toxin crystals are isomorphous with the 
parasporal crystals (refs 18, 19) and show the molecular contacts 
responsible for solubility behaviour in vivo. Four intermolecular salt 
bridges, Asp 142-Arg 165, Asp 224-Arg 562, Asp 590-Arg 178, 
and Glu 223-Lys 293, are located at contacts to three different 
neighbouring molecules. Such salt bridges keep the protoxin 
crystals insoluble until exposed to the extreme pHs in the insect 
midgut. 

Proteolytic activation. Pro-δ-endotoxins have Mr values of either 
~130K or ~70K. Activation by larval gut proteases removes 
the C-terminal half of the larger protoxins (refs 30, 31) and cleaves them 
at residue 28 or 29 from the N terminus. The smaller protoxins, 
such as that of the beetle toxin, are processed only at the N 
FIG. 2 Overview. a, Schematic ribbon representation of the beetle 
toxin showing the domain organization. Secondary structure 
assignments are given by Yasspa within program O (ref. 23). The 
polypeptide pathway is indicated by colouring the chain in the 
rainbow order, from red at the N terminus to blue at the C 
terminus. The three domains are: I, a seven-helix bundle (upper 
left); II, a three-sheet assembly (bottom); and III, a β sandwich 
(upper right). This and all following illustrations of the 
structure are made with the program MOLSCRIPT (ref. 47). b and c, 
Cα trace (stereoview) of the molecule with the five conserved 
sequence blocks indicated by small beads at their Cα positions. 
In b the view is as in a, and in c it is down the central helix 
of the bundle from the bulky end of the molecule; c shows that 
the central helix of domain I and the inner sheet of domain III 
are conserved; b shows that the helices at the domain I-II 
interface and the loops at the domain II-III interface are also 
conserved. Note in c the helix packing of six around one in 
domain I. d, The solvent channel in the C222₁ lattice viewed 
along the c axis. One half of the unit cell thickness is shown, 
containing four molecules. The other half of the cell is related 
to this by a two-fold rotation about horizontal axes (blue lines) 
at (§, y, ± J). The stacking of both layers leaves solvent 
channels that traverse the cell along the c direction. The N 
terminus of the molecule (arrow) is accessible from these 
channels. 
FIG. 3 The seven-helix bundle. a, Helical nets 
showing the position of amino-acid residues along 
the 7 helices: α1 (63-79); α2 (α2a, 85-98 and 
α2b, 104-117); α3 (123-152); α4 (160-185); α5 
(193-214); α6 (222-254) and α7 (259-285). The 
cylindrical surfaces of the helices are cut 
longitudinally on the side facing the solvent and 
flattened to give a view from the interior of the bundle. 
The top of the drawing corresponds to the bulky 
end of the whole molecule. Owing to tilting of the 
outer helices, different helices are in register 
vertically only at a level indicated by two arrows pointed 
at α1 and α7; α5 is the central helix. Dotted curves 
outline the strip of hydrophobic residues down the 
inward surface of the other six helices. b, Cα trace 
(stereoview) for the bundle viewed perpendicular 
to α5. The relative tilt of the outer helices to α5 
and that between adjacent outer helices are both 
about 20°. The Cα trace is shaded grey over 
helices α1 to α3 in the back, striped over helix 
α5 in the centre, and white over helices α4, α6 
and α7 in the front. c, Cross-section of the bundle 
at the level indicated by the arrows in a, viewed 
from the bulky end of the molecule. The helical 
backbone is represented by curly ribbons passing 
through the Cα positions. The outer helices are 
positioned roughly hexagonally around the central 
one and tilted relative to it, so the bundle forms 
a left-handed superhelix. The aromatic side chains 
are packed in an edge-to-face fashion. Hydrogen 
bonds are shown for side-chain atoms. 






terminus (refs 19, 32) where about 50 residues are removed. The activated 
δ-endotoxins show a conserved C terminus, so-called sequence 
block 5 (ref. 1). Its position as the middle strand of the buried 
β sheet in domain III precludes further processing from the C 
terminus. In fact deletion from this site by 4 to 8 residues results 
in inactive mutants with altered solubility and immunogenicity 
(refs 30, 33-35). This is not surprising as the inner sheet can be 
expected to play a critical part in the structural integrity and 
stability of the toxins through interaction with the helical bundle. 

At the N-terminal cleavage sites the different protoxin 
sequences show locally similar hydropathy profiles (refs 36, 37), which 
would be consistent with a common topology for the N-terminal 
region of the activated toxins as seen in the helical bundle of 
the beetle toxin. In crystals of the beetle toxin, the N terminus 
at the start of helix α1 borders on a large solvent channel of 
about 30 Å diameter that crosses the unit cell along the c 
direction (Fig. 2d). This channel could allow access of 
sporulation-associated proteases to the cleavage site in parasporal 
crystals (ref. 19). 

Receptor binding. The insecticidal selectivity of δ-endotoxins is 
due to high-affinity binding to specific membrane receptors (refs 7-9, 38), 
which in three cases seem to be glycoproteins (refs 38-40). For several 
δ-endotoxins the specificity-determining regions have been 
delimited by exchanging sequence segments between closely 
related toxins of differing specificities (refs 13-15). Guided by the 
location of secondary structures in the beetle toxin, a plausible 
alignment of δ-endotoxin sequences was made for the non-
© 1991 Nature Publishing Group 



FIG. 4 The three sheets of domain II. a, Schematic 
ribbon drawing of sheets 1, 2 and 3 laid side-by-side. 
Each is viewed from the exterior of the 
domain. Note the Greek-key topology of sheets 1 
and 2 and the similarity between their folds. b, 
Hydrogen bonding of the polypeptide backbone for 
the three sheets. The β strands are shown by the 
main-chain atoms and by the residue numbers at 
their ends; connecting strands are shown as coils. 
c, Cα trace of the three assembled sheets in 
domain II viewed towards domain I (stereoview). 
The Cα trace is shaded grey over sheet 1, striped 
over sheet 2, and white over sheet 3. d, Cross-section 
of domain II (stereoview) showing the 
packing of three sheets in a triangle around the 
hydrophobic core. The view is towards domain III. 



conserved regions (ref. 12, and T. C. Hodgman, unpublished 
results). Hence the genetically identified specificity-determining 
regions can be mapped to equivalent positions in the beetle 
toxin structure, and these fall mainly in domain II. For instance, 
the dual specificity of CryIIA for Lepidoptera and Diptera, as 
distinct from the Lepidoptera specificity in the closely related 
CryIIB, is determined by residues 307-382 of their sequences (ref. 14), 
which corresponds roughly to sheet 1 (Fig. 4a) plus strand β6 
in sheet 2 and the loop leading up to β7, whereas the Lepidoptera 
specificity of CryIIB is dependent on a longer segment (ref. 14) that 
would include both inner strands of sheet 2. Similarly, the 
toxicities of CryIA(a) and CryIA(c) to two lepidopteran insects 
depend on three segments termed x, y and z (ref. 15): amino-acid 
substitutions in y can reduce toxicity by up to 2,000-fold, and 
segments x and y interact in determining specificity. Aligned 
with the beetle toxin structure, segment x corresponds roughly 
to the outer strands β4 and β5 of sheet 1 and the whole of sheet 
2, including the loop entering β10 in sheet 3; y corresponds to 

FIG. 5 Domain III, schematic ribbon representation of the β sandwich. β 
strands forming the inner sheet are shaded grey. The topology of an 
eight-stranded 'jelly-roll' can be seen by following the β hairpin starting 
with β13, β15 and β23 in the inner sheet, continuing to β16 and β22 in the 
outer sheet, then β17 and β21, β20 in the inner sheet, and ending with β18 
and β19 in the outer sheet. β14 is an excursion from the hairpin and forms 
a fifth antiparallel strand of the outer sheet. Small parallel β sheets are 
added to one edge of the β sandwich, by hydrogen bonding of β1 to β13 
in the inner sheet and β12 to β16 in the outer sheet. Residue numbers in 
the β strands are: β12, 502-506; β13, 509-513; β14, 519-525; β15, 
536-541; β16, 547-554; β17, 558-569; β18, 573-579; β19, 585-591; 
β20, 604-609; β21, 611-614; β22, 619-625; and β23, 631-643. 

strand β10 of sheet 3 and the loop connecting β10 and β11; and 
z extends from β11 to the C-terminal activation site. Furthermore, 
the interaction between x and y can be understood in terms of 
the proximity between β4 on the edge of sheet 1 and β10 on the 
edge of sheet 3. Although z was inferred (ref. 15) to extend into 
domain III, the combined evidence from genetics and 
receptor-binding assays in vitro for Lepidoptera toxins (refs 9, 41) correlates 
receptor recognition with sequence variations within domain 
II. We note that the β ribbons from all three sheets terminate 
in loops in a small region on the molecular apex, in a manner 
reminiscent of the complementarity-determining region of 
immunoglobulins. 

Pore formation. The common mechanism of epithelial cell disruption 
by δ-endotoxins of widely different specificities is believed 
to be the formation of lytic pores of 10 to 20 Å diameter in the 
insect membrane (ref. 10). The structure of the beetle toxin displays 
an apparatus for pore formation in the long, hydrophobic and 
amphipathic helices of domain I which could penetrate the 
membrane. Between the crystal structure in which the bouquet-like 
helical bundle internalizes all the hydrophobic surfaces, 
and the unknown pore structure where hydrophobic surfaces 
would be in intimate contact with the membrane lipids, large 
conformational changes must occur. In the absence of a full 
characterization of the pore-forming process, we propose the 
following by extrapolation from the crystal structure. 

The trigger for the conformational changes may be provided 
by receptor binding and the consequent interaction of toxin 
with the membrane bilayer. Membrane insertion follows rapidly, 
so that a major part of the bound δ-endotoxin cannot be displaced 
from the brush-border vesicles by other toxins recognizing 
the same receptor sites (refs 7, 9). As domain II and probably its 
apical region are most likely to bind the membrane receptors, 
the helices are expected to insert with the 'domain II end' (see 
Fig. 2a) oriented towards the cytoplasm. If helical hairpins are 
to initiate the membrane penetration, as probably happens for 
colicin (refs 28, 42, 43), they will probably be linked at the domain II end. 
So either of the helix pairs α6-α7 or α4-α5 could be the likely 
initiator. The α6-α7 pair is favoured because it forms part of 
the conserved interface with domain II and is well positioned 
to sense the receptor binding. On the other hand, helix α5 is 
the most conserved throughout the family of δ-endotoxins. Point 
mutations in α5 reduce toxicity of a Lepidoptera toxin without 
reducing binding to membranes (ref. 44). Proteolysis in the interhelical 
loops at the domain III end, as in the α3-α4 loop (refs 19, 32), may 
facilitate release of the helix pairs from the tertiary structure of 
the bundle. The insertion of a hairpin can create a defect in the 
membrane, allowing the rest of domain I to participate in pore 
formation in a cooperative manner. 
Summary 

Background: Genetically modified (GM) crops that express 
insecticidal protein toxins are an integral part of 
modern agriculture. Proteins produced by Bacillus thuringiensis 
(Bt) during sporulation mediate the pathogenicity 
of Bt toward a spectrum of insect larvae whose 
breadth depends upon the Bt strain. These transmembrane 
channel-forming toxins are stored in Bt as crystalline 
inclusions called Cry proteins. These proteins are 
the active agents used in the majority of biorational 
pesticides and insect-resistant transgenic crops. Though 
Bt toxins are promising as a crop protection alternative 
and are ecologically friendlier than synthetic organic 
pesticides, resistance to Bt toxins by insects is recognized 
as a potential limitation to their application. 

Results: We have determined the 2.2 Å crystal structure 
of the Cry2Aa protoxin by multiple isomorphous replacement. 
This is the first crystal structure of a Cry toxin 
specific to Diptera (mosquitoes and flies) and the first 
structure of a Cry toxin with high activity against larvae 
from two insect orders, Lepidoptera (moths and butterflies) 
and Diptera. Cry2Aa also provides the first structure 
of the proregion of a Cry toxin that is cleaved to 
generate the membrane-active toxin in the larval gut. 

Conclusions: The crystal structure of Cry2Aa reported 
here, together with chimeric-scanning and domain-swapping 
mutagenesis, defines the putative receptor 
binding epitope on the toxin and so may allow for alteration 
of specificity to combat resistance or to minimize 
collateral effects on nontarget species. The putative receptor 
binding epitope of Cry2Aa identified in this study 
differs from that inferred from previous structural studies 
of other Cry toxins. 

Introduction 

The almost 20 million hectares of GM crop fields in North 
America consist of crops engineered for herbicide or 
insect resistance. The genes that confer the latter trait 
come from Bacillus thuringiensis (Bt), a family of Gram-positive 
sporulating soil bacteria that produce parasporal 
crystals with insecticidal activity. The insecticidal 
activity of particular Bt isolates is directed against narrow 
spectra of insect larval species, usually within a 

3 Correspondence: stroud@msg.ucsf.edu 

* Present address: Maxygen, Redwood City, California, 94063. 



single order. Bacterial toxins known as insecticidal crystal 
proteins (ICPs) or crystalline (Cry) proteins that are 
sequestered as protoxins in crystalline inclusions after 
sporulation mediate this species-specific pathogenicity 
[1]. The Cry protoxins are ingested, solubilized in the 
larval gut [2, 3], and activated by the removal of an 
amino-terminal segment and a C-terminal segment, the 
size of which depends on the gene or its protoxin [2, 4]. 
The active toxins associate with insect-specific receptors 
of gut epithelial cells of the target insect [5] and 
subsequently insert into the cell membrane [6, 7], leading 
to the formation of ion channels [8, 9, 10]. This results 
in disruption of the electrochemical balance across the 
basal membrane, gut paralysis, and larval death [11, 12, 
13, 14]. The host cadaver serves as growth medium 
for vegetative cells arising from germination of the Bt 
endospores. 

Species selectivity of Cry proteins is encoded in the 
binding site for the target receptor [5]. Classification of 
the Cry proteins is based on amino acid sequence identity 
[15] and is roughly correlated with the taxonomic 
order of susceptible insect species, spanning species 
of agricultural (Cry1 Lepidoptera, Cry2 Lepidoptera, and 
Cry3 Coleoptera) and public health (Cry2 and Cry4 Diptera) 
significance. The structure may help guide mutagenesis 
followed by screening that is directed toward 
the fine tuning of species selectivity in order to design 
insecticides that do not kill nontarget organisms such 
as monarch larvae [16]. It also may assist in the elucidation 
of the structural basis of resistance to Bt toxins and 
the subsequent generation of novel insecticidal toxins 
for use on Bt-resistant insects [17, 18]. 

Structure-based protein engineering of Cry toxins 
may direct the search for variants with broader susceptible 
species spectra, optimal potency, and stability properties. 
Cry2Aa is among an unusual subset of Cry proteins 
possessing broad insect species specificity by 
exhibiting high specific activity against two insect orders, 
Lepidoptera and Diptera [19, 20]. It is lethal to 
more lepidopteran species than the Cry1 toxins deployed 
against agriculturally important Lepidoptera [21] 
and exhibits a low level of crossresistance in Cry1A-resistant 
insects [22]. Also, the mode of action of Cry2Aa 
may be distinct from that of other Cry toxins [23]. Thus, 
it could serve as a platform for the design of Cry toxins 
with broader susceptible species spectra and minimal 
Cry1A-derived crossresistance in the field. 

Chimeric-scanning mutagenesis experiments have 
identified disjoint blocks (D and L, see Results and Discussion) 
of the Cry2Aa sequence that separately confer 
specificity against dipteran (D) and lepidopteran (L) species 
[24, 25]. These experiments also demonstrate that 
maximal activity against lepidopteran species requires 
not only L block residues but also some of the specificity 
determinants of the D residue block. Further, Cry2Ab, 
an 87% sequence identical homolog of Cry2Aa, has 

Key words: Bacillus thuringiensis; delta-endotoxin; Cry2Aa; binding 
epitope; crystal structure; X-ray 
Table 1. Data Collection, MIR Phasing, and Structure Refinement Statistics for Cry2Aa 

                                            Native    U(NOa)    PtCNS     Ptl 8     NbCl2     Ru 1      Hg a
Data Collection (1.08 Å)
Resolution (Å)                              2.2       2.6       2.5       2.6       2.6       2.5       2.5
Unit cell dimensions (Å)                    a = b = 85.6, c = 163.9
Space group                                 P4₃2₁2
Number of observed reflections (σ > -2.5)   245,580   69,703    139,057   70,618    73,949    113,242   126,930
Number of unique reflections                31,591    17,370    20,476    17,999    17,455    19,475    20,198
Completeness (%)                            99.3      89.1      99.0      92.       89.3      94.7      97.0
Rsym (%)                                    5.7       5.1       5.7       4.7       4.3       5.4       6.3

Phasing/MIR
Resolution                                            2.6       2.6       2.6       4.5       2.6       5.25
Number of sites refined                               5         6         7         6         5         3
Number of reflections (σ > 3)                         16,177    17,808    16,516    3,136     17,074    2,419
Riso (%)                                              16        24        32        10        10        16
RCullis                                               .62       .62       .59       .60       .67       .62
RKraut                                                .13       .15       .20       .06       .08       .09
Phasing power                                         1.1       1.9       1.8       1.4       0.8       1.2
<FOM> centric (a)                                     .36       .39       .41       .41       .30

(a) Individual data set results. 
negligible activity against dipteran species and 3- to 
8-fold less activity against certain lepidopteran species 
[25, 26]. Hence, Cry2Aa structure and mutagenesis data 
provide the basis for future protein engineering of Cry 
toxins with modified specificity and selectivity profiles. 

To understand the structural determinants of Cry toxin 
specificity, we determined the crystal structure of the 
protoxin of Cry2Aa from Bacillus thuringiensis subsp. 
kurstaki. The complete structure was determined by 
multiple isomorphous replacement and refined to 2.2 Å 
resolution. We have identified a candidate toxin receptor 
binding surface that is consistent with available 
chimeric-scanning mutagenesis data. 

Results and Discussion 

The structure of Cry2Aa from Bacillus thuringiensis
subsp. kurstaki was determined by multiple isomorphous
replacement using six heavy atom derivatives
and was refined to 2.2 Å resolution with an R factor of 18%
(Table 1). The structure of the 633-amino acid protoxin
contains the N-terminal 49-amino acid peptide that is 
cleaved upon activation and the three domains of what 
will become the mature toxin [27]. The structures of the 
three domains are surprisingly similar in overall topology
(Figure 1a) to those of the activated toxins Cry3Aa [28]
and Cry1Aa [29], suggesting that removal of the activation
peptide serves to expose regions of the toxin rather
than alter its conformation. This structural homology is
also surprising since these toxins have little sequence 
identity to Cry2Aa (20% and 17%, respectively). In the 
mature toxin, the N-terminal domain (residues 1-272) is 
a pore-forming seven-helical bundle (Figure 1d) [1]. The 
second domain (residues 273-473) is a receptor binding
β prism, a three-fold symmetric arrangement of β
sheets, each with a Greek key fold (Figure 1e). The third
domain (residues 474-633) is implicated in determining
both larval receptor binding [30, 31] and pore function
[32] and is a lectin-like C-terminal β sandwich (Figure 1f).

Available chimeric-scanning mutagenesis data [24,
25] define a candidate toxin-receptor binding surface
on Cry2Aa that comprises a distribution of hydrophobic
residues (Ile474-Ala477 from β12a, Val365-Leu369
from the β5-β6 loop, and Leu402-Leu404 from
the β7-β8 loop) across the solvent-exposed surface of
the β prism and β sandwich domains (Figure 1b). Proteolytic
activation of the toxin involves the removal of the
49 N-terminal amino acids and exposes residues comprising
this putative toxin-receptor binding surface. Removal
of the 49 amino-terminal residues, comprising
α0, α0a, and an N-terminal coil, would not affect the
structure of the seven-helical membrane insertion domain,
as seen by comparing the structures of the activated
toxin Cry1Aa and that of the protoxin Cry2Aa.

Figure 1. Topology and Solvent Accessible
Surface of Cry2Aa 

(a) Ribbon diagram, rendered by Midas Plus
[48], of Cry2Aa. Domain I is shown in magenta,
domain II is shown in blue, and domain
III is shown in cyan. The N terminus is shown
in red, while functionally important loops delimiting
the putative toxin-receptor binding
epitope are shown in green. A Cry2Aa insertion,
relative to Cry3Aa and Cry1Aa, before
β12 at the N terminus of domain III is shown
in magenta. Numbered β strands referred to
in the text are labeled.

(b) The solvent accessible surface, as calculated
by GRASP [49], of domains II and III of
Cry2Aa. The orientation is identical to that
shown in Figure 1a. The projection of residue
hydrophobicity onto this surface is shown in
color. Portions of the hydrophobic surface
contributed by residues 474, 476, and 477 are
shown in cyan, those contributed by residues
365-369 are shown in blue, those contributed
by residues 402 and 404 are shown in magenta,
and the remainder of the surface contributed
by hydrophobic residues is shown in
yellow. The remaining surface that is identified
as nonhydrophobic is colored white. Residue
hydrophobicity is as defined by GRASP [49]. The prominent hydrophobic patch is the center of the putative toxin-receptor binding
epitope. For orientation, the portion of the surface contributed by residue 357 of the β4-β5 loop is shown in red.

(c) The solvent accessible surface (as calculated by GRASP) of domains II and III of Cry2Aa. The orientation is identical to that shown in 
Figure 1a. The projection of residue hydrophobicity onto this surface is shown in yellow, while the N terminus is shown in red; the N terminus
sterically hinders access to the putative toxin-receptor binding epitope. Portions of the surface that are identified as nonhydrophobic are 
colored white. 

(d-f) The three domains of Cry2Aa shown in the same orientation as in Figure 1a. Labels with amino acid numbers identify the visible N and 
C termini of each domain in the figures. 



This is also expected since constructs consisting of
the N-terminal α-helical domain of the complete Cry3Ba1
protoxin (prior to cleavage) are capable of nonreceptor-mediated
partitioning into lipid bilayers [33], as is the
activated toxin.

The structure of Cry2Aa suggests that the N-terminal
residues should sterically hinder access to the putative
binding epitope β5-β6 and β7-β8 loops (Figure 1a,
shown in green) and the exposed parts of domain III
closest to domain II. Projection of hydrophobicity onto
the solvent accessible surface of domains II and III reveals
an 800 Å² hydrophobic patch (Figure 1b) proximal
to these loops. However, while the structure suggests
that the 49 N-terminal residues (α0, α0a, and the N-terminal
coil) should sterically hinder access to the putative
binding epitope, the biological rationale for this function
is unclear. It is unlikely that Bt possesses a receptor
with affinity for the activated toxin. Hence, it does not
seem likely that the N terminus serves to prevent premature
activation of the toxin within Bt. One simple explanation
is that occlusion of the hydrophobic patch of the
putative binding epitope prevents nonspecific aggregation
of the toxin with itself or other host proteins. Another
explanation is that the N-terminal amino acids play a role
in the formation of the environmentally stable crystalline
inclusions.

The specificity-distinguishing residues are also indicated
by comparison of the Cry2Aa structure with the
structure of the highly homologous (87% sequence identity)
Cry2Ab that is inactive against some Cry2Aa target
species (Figure 2a). Chimeric-scanning mutagenesis
[24, 25] defines a continuous 106 amino acid block,
307-412, of specificity-distinguishing residues. (Specifically,
[25] demonstrated that substitution of residues
278-340 resulted in loss of dipteran-specific activity in
Cry2Aa, while [24] demonstrated that substitution of residues
307-382 conferred dipteran-specific activity to
Cry2Ab. Thus, in our discussion, we adopt residue 307
as the N-terminal boundary of the specificity-conferring
sequence in Cry2Aa.) Within these 106 amino acids,
there are 23 residues that differ between Cry2Aa and
Cry2Ab (sequence alignment presented in Figure 5).
Most of the Cry2Aa-Cry2Ab amino acid differences lie
within or about the domain II/III 800 Å² hydrophobic
patch (Figure 1b) and surrounding residues from the β5-β6,
β7-β8, and β4-β5 loops (Figure 1a). The picture of
the putative toxin-receptor binding surface that emerges
is that of an 800 Å² hydrophobic region surrounded by
three loops, those joining β4-β5, β5-β6, and β7-β8,
which are also a part of the putative binding site. The
three loops contain hydrophilic side chains that may be
involved in specific hydrogen bonding with the receptor
and so signal a portion of the site that could be mutated
both to probe these interactions and to alter specificity.
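
The residue bookkeeping in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes inclusive residue ranges for block D (307-340) and block L (341-412), as given in the Figure 2 legend; the names `D_BLOCK`, `L_BLOCK`, and `block_length` are illustrative only.

```python
# Inclusive residue ranges, taken from the text and the Figure 2 legend.
D_BLOCK = (307, 340)
L_BLOCK = (341, 412)
DIFFERING_RESIDUES = 23  # Cry2Aa/Cry2Ab differences within residues 307-412


def block_length(block):
    """Number of residues in an inclusive (start, end) range."""
    start, end = block
    return end - start + 1


d_len = block_length(D_BLOCK)
l_len = block_length(L_BLOCK)
total_len = block_length((D_BLOCK[0], L_BLOCK[1]))
fraction_differing = DIFFERING_RESIDUES / total_len

print(f"block D: {d_len} residues, block L: {l_len} residues")
print(f"combined block: {total_len} residues")
print(f"fraction differing: {fraction_differing:.1%}")
```

The two blocks are contiguous, so their lengths sum to the 106-residue block quoted in the text, and roughly one residue in five differs between the two homologs in this region.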

Figure 2. Space-Filling Representation of
Cry2Aa Specificity-Conferring Residues, Detail
of Buried D Block Residues, and Electron
Density Covering Buried D Block Residues

(a) Space-filling model of Cry2Aa domains II
and III with the N terminus and membrane-inserting
domain I removed. The orientation
reflects a -20° rotation relative to that shown
in Figure 1a. The results of chimeric-scanning
mutagenesis experiments [24, 25] are projected
onto the van der Waals surface of
Cry2Aa. The residues colored green and cyan
are single amino acid differences between
Cry2Aa and Cry2Ab in block L (residues 341-412).
The residues colored yellow and orange
are single amino acid differences between
Cry2Aa and Cry2Ab in block D (residues 307-340).
The bar represents an approximate 10 Å
scale.

(b) Packing of D block residues behind the
β4-β5 loop. The β4-β5 loop contains L block
specificity determinants with which the buried
D block residues interact.

(c) Electron density for the putative receptor
binding site covering residues of the β sheet
behind the β4-β5 loop.

Figure 3. Detail of Ribbon Diagram Overlap of Cry2Aa and Cry1Aa
The Cry1Aa domains have been independently fit to those of Cry2Aa.
The functionally important loops delimiting the putative toxin-receptor
binding epitope are shown in green (Cry2Aa) and blue (Cry1Aa).
The Cry2Aa insertion, relative to Cry3Aa and Cry1Aa, before β12
at the N terminus of domain III is shown in magenta, while the
corresponding loop from Cry1Aa is shown in cyan (see arrow).

The proximity of this surface to solvent-exposed loops
of the lectin-like domain III is consistent with the finding
that domain III plays a role in fine-tuning the susceptibility
of different species. This has been demonstrated by
replacement of domain III [30, 31] to make chimeric
toxins with altered specificity characteristics. The N-terminal
strand β12a of domain III is not present in the
three-dimensional structures of Cry1Aa or Cry3Aa. The
turn between this strand and the rest of domain III is
functionally replaced almost exactly by a loop that connects
β3 and β4 of domain II in the homologous Cry1Aa
and Cry3Aa structures (Figures 1a and 3, shown in magenta).
This functionally conserved β12a motif occupies
the same region of the structure as the β3-β4 turn in
Cry1Aa and Cry3Aa, so it implies conservation of a functional
role in protecting the hydrophobic portion of the
putative receptor binding surface implicated by the homolog
substitutions.

Chimeric-scanning mutagenesis identifies fairly large

Figure 4. Schematic Representation of Chimeric-Scanning Mutagenesis
Data

The first and last bands represent the Cry2Aa and Cry2Ab sequences,
respectively. The middle bands represent chimeric combinations
in which gray regions correspond to Cry2Ab sequence and
white regions correspond to Cry2Aa sequence. For all bands, except
that corresponding to Hyb513, the three central vertical bars represent
amino acids 278, 340, and 412. For Hyb513, the two central
vertical bars represent amino acids 307 and 382. The activity designations
represent an approximate log scale. For example, the (+++)
activity designation for chimera DL112 corresponds to an ID50 of
126 (85.7-187) ng, while the (+) designation for chimera DL115 corresponds
to an ID50 of 3,200 (1,340-51,900) ng; the confidence intervals
correspond to 2σ.

Figure 5. Detail Sequence Alignment of
Cry2Aa and Cry2Ab

Sequence alignment of the D and L block
regions of Cry2Aa and Cry2Ab generated using
ALSCRIPT [51]. In the alignment, identical
amino acids are unmarked; similar residues
(as defined by ALSCRIPT) are colored yellow,
while dissimilar residues are marked green.
The secondary structure associated with sequence
is presented in the lowermost row.
The block of secondary structure associated
with D block residues is colored magenta,
while that associated with L block residues
is colored cyan.



regions of the protein sequence that confer differential
specificity to Diptera and Lepidoptera [25] (Figure 4). In
Figure 4, the first band represents the sequence of
Cry2Aa with its high level of activity (+++) against both
Lepidoptera and Diptera. The last band represents the
Cry2Ab sequence that exhibits negligible activity ( )
against Diptera and up to one order of magnitude lower
activity against Lepidoptera when compared with
Cry2Aa. The second band (DL112) represents a Cry2Aa
chimera that contains the Cry2Ab sequence for the
block D residues 307-340 (dipteran-specific). This chimera
has negligible activity against Diptera and is suggested
to have reduced activity (at the 1σ level) against
Lepidoptera, indicating that block D correlates with dipteran
specificity. The activity profile of a reverse chimera
(the third band) [24], in which Cry2Ab contains the block
D sequence from Cry2Aa, shows a more significant reduction
than DL112 against Lepidoptera (of a different
species) but is only reduced 20-fold toward Diptera versus
Cry2Aa. Thus antidipteran activity tracks with the D
block of Cry2Aa.

The fourth band (DL115) represents a Cry2Aa chimera
that contains the Cry2Ab sequence for the dipteran- 
disfavoring D block and for a neighboring region of se- 
quence, the lepidopteran-disfavoring L block (residues 
341-412). The activity profile of this construct against 
both Diptera and Lepidoptera most closely parallels that 
of Cry2Ab, which is consistent with blocks D and L 
encoding essentially all of the differential specificity de- 
terminants. In summary, the differential specificity for 
Diptera in Cry2Aa depends on block D, while that for 
Lepidoptera depends on block L. Maximal activity 



Table 2. Solvent Accessible Surface Areas, Contacts within 3.4 Å, and Hydrogen Bonds for the Specificity-Conferring
Residues in Cry2Aa

Residue   Exposed         Exposed Surface    Contacts
          Surface (Å²)    Beyond Cβ (Å²)

Dipteran Specificity-Conferring
Ile307    6               4                  Ser309, Ser343, Gly481, (Met483), (Tyr342)
Ser309    26              26                 Asn341, Ile307, Thr364, (Ser363)
Ile311    1               0                  Cys362, (Arg339), (Asn361)
Thr314    7               7                  Ser337, Asn357, Asn336, His358, (Asn359)
Ile318    91              89                 Thr332, (Thr331)
Gly324    78              0
Ser334    5               5                  Leu316, Asn336, (Phe409), (Gln399), (Arg315)
Asn336    6               6                  Thr314, Ser334, Ala460, Ala353, (Gly313), (Ile351)
Ser337    0               0                  Thr314, Ala353, His358

Lepidopteran Specificity-Conferring
Val346    39              34                 Tyr342, (Asn303), (Gly344)
Leu350    27              26                 Asn449, Ile450
Thr354    50              26                 Glu451
Asn355    109             76                 (Pro457)
Leu356    60              43                 (Ala353)
His358    43              14                 Ser312, Thr314, Ser337, (Gly313)
Val365    107             75                 (Asn336)
Ser370    68              39                 Pro367, (His21)
Thr382    54              24                 (Asn392), (Thr391)
Ser390    9               9                  Ser329, Thr331, Asp383
Gln399    33              33                 Val374, Arg375, Arg405, (Leu404)
Ser403    93              73
Cys406    37              27                 Ser397, Phe398, Cys362
Ser410    89              72

All data were calculated for the activated toxin using HBPLUS [50]. Entries in the left-most column are the 23 specificity-conferring residues. 
Entries in the right-most column conform to hydrogen bonding geometry, except for those enclosed in parentheses that are van der Waals 
contacts. Bold entries in the right-most column identify specificity-conferring residues also found in the left-most column.

GKP-6: 5'-CCC ATG GAT AAT GTA TTG AAT AGT GGA AG-3'

GKP-7: 5'-CAA GCT TTA GGT TAA CTT GAA ATG A-3'

Figure 6. Restriction Maps Detailing the Construction
of Plasmid pSB307

(a) Nucleotide sequences of the oligonucleotides
GKP-6 and GKP-7.

(b) Restriction maps of pSB302, pSB304, and 
pSB307. 



against Lepidoptera, as seen in Cry2Aa, still requires 
some contribution from block D (sequence alignment 
presented in Figure 5). 

Figure 2a projects the Cry2Aa/Cry2Ab homolog differ- 
ences onto the van der Waals surface of Cry2Aa (for 
clarity, only domains II and III are shown). In the D block, 
there are nine residues that differ between Cry2Aa and 
Cry2Ab. Surprisingly, most of these are buried. The nota- 
ble exceptions are Ile318 and Gly324 (Asn and Val, re- 
spectively, in Cry2Ab), which are distant from the puta- 
tive binding epitope, and the moderately exposed 
Ser309 (Asn in Cry2Ab) within the putative binding epi- 
tope (Table 2). Ile307 and Ile311 are found packed be- 
hind exposed residues on the putative binding surface. 
Almost half of the variant residues from block D (Thr314, 
Ser334, Asn336, and Ser337) are in a cluster that is 
packed behind the β4-β5 loop presented from within
the 72-residue L block (Figures 2b and 2c). 
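
The buried versus exposed split described in this paragraph can be reproduced from the exposed surface areas listed for the block D residues in Table 2. The sketch below is illustrative only: the 10 Å² cutoff and the names `D_BLOCK_EXPOSED_AREA` and `BURIED_CUTOFF` are assumptions made for this example, not values taken from the original analysis.

```python
# Exposed surface area (Å²) of the nine block D variant residues, from Table 2.
D_BLOCK_EXPOSED_AREA = {
    "Ile307": 6,
    "Ser309": 26,
    "Ile311": 1,
    "Thr314": 7,
    "Ile318": 91,
    "Gly324": 78,
    "Ser334": 5,
    "Asn336": 6,
    "Ser337": 0,
}

# Hypothetical cutoff chosen for illustration; the text does not state one.
BURIED_CUTOFF = 10.0

buried = [res for res, area in D_BLOCK_EXPOSED_AREA.items() if area < BURIED_CUTOFF]
exposed = [res for res, area in D_BLOCK_EXPOSED_AREA.items() if area >= BURIED_CUTOFF]

print("buried: ", ", ".join(buried))
print("exposed:", ", ".join(exposed))
```

With this cutoff, six of the nine residues fall in the buried group, and the exposed group contains exactly the three exceptions named in the text (Ser309, Ile318, and Gly324).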

Two of these buried variant residues, Thr314 and
Ser337, make side chain-main chain hydrogen bonds
with the β4-β5 loop. A third residue, Asn336, makes
main chain-main chain hydrogen bonds with the β4-β5
loop, and Thr314 makes side chain-side chain hydrogen
bonds with Ser334. In the less active homolog, Cry2Ab,
these residues are replaced with approximately isosteric
nonhydrogen bonding residues (Thr314Ala, Ser334Ala,
Asn336Leu, and Ser337Ala), suggesting that this
pattern of substitutions abolishes affinity for the dipteran
receptor. It is conceivable that the Ile318Val and Gly324Val
substitutions are part of a region of the protein that
interacts only with the receptor(s) found in dipteran species
and shares some components with the putative
binding epitope that we identify. However, we speculate
that the same exposed surface area binds to the lepidopteran
and dipteran receptors. In this model, these
solvent inaccessible residues behind the putative receptor
binding surface may serve to alter the conformation
of the β4-β5 loop, with its several hydrophilic specificity-determining
residues. Similar modulation of specificity
in protein-protein interactions by noncontact residues
is seen in the context of immunoglobulin residues that
affect conformation of the complementarity-determining
residues (CDR) at the binding surface [34]. Likewise,
affinity maturation of a Fab/antigen complex results in
the optimization of antibody/antigen binding by residues
15 Å from the interaction surface [35].

The structures of Bt toxins provide a template for 
design and discovery of changes that alter receptor 
targeting in order to either broaden selectivity for better 
field efficacy, prolong the life of existing agents, or avoid 



unwanted effects on nontarget organisms. Resistance 
to Bt toxins is recognized as a potential limitation in 
their application. Early studies concluded that recessive 
genes controlled the inheritance of Bt resistance. How- 
ever, a recent study suggests that Bt resistance can be 
inherited as an incompletely dominant autosomal gene 
[36]. The authors note that such a mechanism of Bt 
resistance inheritance in the field would significantly 
reduce the usefulness of the high dose/refuge strategy 
of resistance management in which some mates are 
not challenged with toxin. Knowledge of any presumed 
modifications in the receptor that cause resistance can 
potentially instruct rational protein engineering of the 
receptor binding surface to yield toxins that might by- 
pass resistance and still bind to the modified receptor 
of resistant insect species. 

Potential collateral effects upon nontarget insect spe- 
cies [36] and effects upon nontarget predatory insects 
that consume target insect species [37] have been at- 
tributed to Bt GM crops. The structures provide a blue- 
print for focused mutagenesis followed by screening to 
select for each specific target species in a particular 
crop, so as to diminish collateral toxicity to nontarget 
species. By shedding light on the molecular basis of 
toxin-host receptor recognition, the structure provides 
a foundation for engineering Bt-based toxin genes that 
may develop broader insect species specificity, species 
selectivity tuned to reduce collateral impact upon non- 
target species, and longer field efficacy. 

Biological Implications 

We have determined the three-dimensional structure of
the insecticidal toxin Cry2Aa in order to understand the
structural determinants of toxin specificity. Genetically
modified (GM) crops that express insecticidal protein
toxins are an integral part of modern agriculture. Proteins
normally produced by different strains of Bacillus
thuringiensis (Bt) during sporulation mediate a species-specific
pathogenicity of Bt toward insect larvae of the
target species and are the active agents in the majority
of biorational pesticides and insect-resistant transgenic
crops. Though promising as a crop protection alternative,
transgenic crops are not without problems. Bt GM crops
may pose a threat to nontarget insect species [16] or to
nontarget predatory insects that consume target insect
species [37]. In addition, resistance to Bt toxins is recognized
as a potential limitation to their application, which
is ecologically friendlier than that of traditional organic
pesticides. For instance, EPA approval of Bt GM maize was
contingent upon the establishment of viable resistance
management strategies [36].

Cry2Aa is among an unusual subset of crystalline (Cry) 
proteins possessing broad insect species specificity by 
exhibiting high specific activity against larvae from two 
insect orders, Lepidoptera and Diptera [24, 25], of ag- 
ricultural and public health significance. Also, the 
Cry2Aa protoxin is significantly smaller (72 kDa) than 
those of the Cry1 proteins (~135 kDa) in the current 
generation of transgenic crops. Since gene size can be a 
limiting factor of protein expression in plants, transgenic 
constructs based upon Cry1 usually express a smaller 
portion of the gene that contains essentially the acti- 
vated toxin. Cry protoxins are presumed to be more 
environmentally stable than the activated toxins; hence, 
transgenic constructs that express the Cry2Aa protoxin 
could deliver higher toxin doses in the field due to 
greater stability [22]. Also, expression of the protoxin
reduces collateral damage to nontarget insect species 
since it depends on specificity of the host proteases for 
activation [3, 37]. Chloroplast-directed overexpression 
of the Cry2Aa protoxin has been demonstrated and 
shows expression levels equivalent to 2%-3% total sol- 
uble protein in transformed leaves [22]. Such high levels 
of expression, 20- to 30-fold higher than current nuclear 
transgenics, could diminish the opportunity for devel- 
oping resistance by significantly increasing toxin dose 
at the initial encounter with the insect. 

Cry2Ab, an 87% sequence identical homolog of 
Cry2Aa, has negligible activity against dipteran species 
and 3- to 8-fold less activity against certain lepidopteran 
species [25, 26]. Also, there exists a unique body of 
chimeric-scanning mutagenesis data in the Cry2Aa/ 
Cry2Ab system that has identified determinants of spe- 
cies specificity in the amino acid sequence [24, 25]. 
Correlating the structure with chimeric-scanning data 
indicates that the putative receptor binding epitope of 
Cry2Aa lies on the core β sheet and differs from the end
of the β sheet apical loops of domain II, as suggested
from structures of the other Cry toxins [28, 29]. Thus, a 
target surface is defined for directed mutagenesis that 
may focus engineering of the toxin either to develop 
broader insect species specificity, species selectivity 
tuned to reduce collateral impact upon nontarget spe- 
cies, or longer field efficacy. Until now, the search for 
new insecticidal bacterial toxins involved collection and 
assay of novel isolates of Bt and other bacteria known 
to have insecticidal activity. Recent reports describe the 
isolation of bacterial species that produce new classes 
of insecticidal toxin [38]. These structure data may per- 
mit rational engineering of insecticidal Cry toxins with 
desired characteristics. 

Experimental Procedures 
Cloning of Cry2Aa 

Oligonucleotide primers flanking the coding region of cry2Aa were
generated based on the published sequence of the gene from Bt
kurstaki strain HD-1 [26]. Primer GKP-6 is a 29-mer that corresponds
to the N-terminal 26 nucleotides of the coding region (Figure 6a).
Primer GKP-7 is a 25-mer that corresponds to a fragment overlapping
the HindIII site that is located ~350 nucleotides downstream
from the stop codon (Figure 6a). Plasmid DNA isolated from Bt
kurstaki HD-1 served as a template for the PCR reaction. The resulting
2100 bp fragment was purified and served as the probe used
to identify the Cry2Aa operon with its accompanying open reading
frames. The hexamer-primed labeling method was used to incorporate
³²P-dCTP into the probe.
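
The primer descriptions above can be checked directly against the sequences given with Figure 6. The sketch below is a minimal consistency check, assuming that the one ambiguous character in the printed GKP-6 sequence is a G; `HINDIII_SITE` is the standard HindIII recognition sequence, and all variable names are illustrative.

```python
# Primer sequences as given with Figure 6 (5' to 3'), spaces removed.
# Assumption: the ambiguous character in the printed GKP-6 sequence is read as G.
GKP6 = "CCC ATG GAT AAT GTA TTG AAT AGT GGA AG".replace(" ", "")
GKP7 = "CAA GCT TTA GGT TAA CTT GAA ATG A".replace(" ", "")

HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence

# Portion of GKP-6 that starts at the ATG start codon.
start_codon_index = GKP6.find("ATG")
coding_part = GKP6[start_codon_index:]

# Position of the HindIII site within GKP-7 (-1 if absent).
hindiii_index = GKP7.find(HINDIII_SITE)

print(f"GKP-6 length: {len(GKP6)} nt; from ATG onward: {len(coding_part)} nt")
print(f"GKP-7 length: {len(GKP7)} nt; HindIII site at index {hindiii_index}")
```

Under that reading, the lengths agree with the text: GKP-6 is a 29-mer whose last 26 nucleotides begin at the start codon, and GKP-7 is a 25-mer that contains a HindIII site.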

Previously, it was indicated that the entire gene, including the
coding region and the promoter, is present on a 5.0 kb HindIII fragment
[26] of a plasmid isolated from Bt kurstaki HD-1. The 3.5-7 kb
fragments obtained by HindIII digestion of plasmid DNA isolated
from Bt kurstaki HD-1 were ligated into an E. coli cloning vector,
pTZ18R (Pharmacia, vecbase accession #VB0071) and were used
to transform E. coli DH5 cells by electroporation. Electroporated
DH5 cells were plated onto LB-Amp50 plates containing X-gal and
IPTG for color selection. The presence of the cry2Aa gene in the
transformed colonies (white) was confirmed by hybridization of the
PCR-generated probe. Restriction analysis was used to confirm
that the clones contained inserts with the cry2Aa gene and also to
establish the orientation with which the fragment was inserted into
pTZ18R. The results of this analysis revealed that one of the clones
corresponded to the orientation designated pSB302 (Figure 6b),
while two clones had the opposite orientation and were designated
pSB303. pSB304, obtained by deleting the 1.2 kb BamHI fragment
(dotted line in Figure 6b), was also transformed into DH5.

Total protein analysis for proteins produced by E. coli strain DH5 
carrying pSB302, pSB303, or pSB304 was performed by SDS-PAGE. 
A protein band of molecular weight 62 kDa, absent in the original 
DH5 cells, was observed in all of the clones examined. The level of 
expression was the highest in those cells bearing pSB304. Most of 
the toxin could be found in the pelletable fraction following sonica- 
tion of the cells. Samples were evaluated for biological activity by 
bioassay using Manduca sexta as the target insect. All of the clones
(pSB302, pSB303, and pSB304) were active with LD50 values of
~500 ppm.

The pSB304 plasmid retains a unique EcoRI site, ~200 nucleotides
upstream of the cry2Aa promoter, into which the EcoRI-linearized
Bacillus cereus vector pBC16.1 (GenBank accession number
U32369) was cloned (Figure 6b). The resulting clone was used to
transform E. coli DH5, and clones containing the new plasmid were
designated pSB307. Confirmation of the identity of the new plasmid
and determination of the orientation of the pBC16.1 insert, with
respect to the cry2Aa gene, were made by restriction mapping. One
of the plasmids, pSB307.4, was transformed into Bt cryB (a cry- strain)
by electroporation. The plasmid content of these isolates
was verified by restriction mapping.

Cry2Aa expressed well in Bt cryB cells transformed with 
pSB307.4, and the protein formed crystalline (rhombohedral) inclu- 
sions. The cells were harvested by centrifugation, washed with wa- 
ter, and lyophilized. Dried cell mass was added to the insect diet and 
fed to M. sexta larvae. The results confirmed that Bt cryB (pSB307.4) 
exhibited high insecticidal activity. 

Protein Expression and Purification 

The plasmid (pSB307.4) containing the Cry2Aa operon, with its accompanying
open reading frames, was used to transform the cry- strain
of Bt (cryB) as previously described [39]. Cry2Aa was purified
from the crystalline inclusions produced in the cells. Inclusions were
harvested by cell lysis and centrifugation. Crystalline inclusions
were washed repeatedly with 0.5 N NaCl to remove proteases and
were transferred to buffer (10 mM Tris-HCl, 1 mM EDTA [pH 8.0])
with 2% mercaptoethanol. Titrating the pH to 10.5, using NH4OH,
solubilized the protein from the crystalline inclusion bodies. The
protein was purified by Sephacryl S300HR column chromatography
as described [40] and concentrated by ultrafiltration to 10 mg ml⁻¹.

Crystallization and Structure Determination 
For recrystallization, hanging drops of the resulting concentrated
protein (10 µl concentrated protein buffered as described above)
were equilibrated against wells that contained Tris buffer (10 mM
Tris-HCl, 1 mM EDTA [pH 8.0]). Crystallization was induced by the
gradual shift to neutral pH as the mobile NH3 diffused from the
drops. Crystals were transferred to storage buffer (50 mM PIPES,
250 mM NaCl [pH 6.8]) with 2% mercaptoethanol. The resulting
crystals are in space group P4₃2₁2; unit cell constants a = 85.6 Å,
c = 163.9 Å. They have one monomer in the asymmetric unit, an
estimated 34% solvent content, and diffract to ~3.0 Å using Cu Kα
X-rays from a rotating anode generator and to 2.0 Å at a synchrotron
source after flash freezing.

For the collection of data at 100 K, the crystals were transferred
in three steps to a final 20% solution of cryoprotectant (a 1:1 mixture
of 1,2-propanediol and glycerol) and storage buffer and flash frozen
in a cold nitrogen stream. X-ray diffraction data were collected at
SSRL beamline 7.1 using a wavelength of 1.08 Å. Intensity data were
integrated, scaled, and merged using HKL [41]. The overall Wilson
B factor (3.0 Å > d > 2.2 Å) was 14 Å².
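
For orientation, the relation between the wavelength used and the resolution limit of the native data follows from Bragg's law, λ = 2 d sin θ. The sketch below computes the scattering angle 2θ at which 2.2 Å reflections are recorded for 1.08 Å X-rays; the function name `bragg_two_theta` is illustrative, and the calculation is a generic worked example rather than part of the original data processing.

```python
import math

WAVELENGTH = 1.08  # Å, wavelength used for data collection
D_MIN = 2.2        # Å, resolution limit of the native data set


def bragg_two_theta(wavelength, d_spacing):
    """Return the scattering angle 2θ in degrees from Bragg's law, λ = 2 d sin θ."""
    sin_theta = wavelength / (2.0 * d_spacing)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("d-spacing is not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))


two_theta = bragg_two_theta(WAVELENGTH, D_MIN)
print(f"2θ at {D_MIN} Å resolution: {two_theta:.1f}°")
```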

De novo phasing was achieved using multiple isomorphous replacement
after attempts to find a molecular replacement solution
to the phase problem employing the available coordinates of Cry3Aa
and Cry1Aa were unsuccessful. The heavy atom derivatives (Table
1) were solved from difference Patterson maps as displayed using
XtalView [42]. Difference Fourier inspection for minor sites and refinement
of the heavy atom positions, occupancies, and B factors
were completed in PHASES [43]. The resulting protein electron density
map was subjected to solvent-flipping density modification, as
implemented in Solomon [44]. The helical bundle was apparent in
5 Å maps; at 3 Å resolution, the correct enantiomorph was clear from
its stereochemistry. Using Cry1Aa as the initial building template,
polyalanine versions of the helical and jellyroll domains were manually
positioned using O [45], and the fit was optimized using the real-space
refinement package ESSENS [46]. Positional and simulated
annealing refinement were carried out using the maximum likelihood
target of XPLOR 3.85 x [47].